
In a manufacturing company, the sales team receives numerous customer quote requests via email and chat. These requests need to be categorized by product type (e.g., valves, pumps) and by complexity (simple or complex). Sorting them manually is slow and error-prone. A text classifier can automate this step, so sales representatives can prioritize and respond faster, which in turn can help improve sales conversion rates.

**Goal:**
Develop a text classifier to automatically categorize customer quote requests by product type and complexity. This will help sales reps prioritize and manage their workload more efficiently.

**Data Collection:**

Collect a dataset of customer quote requests, including emails and chat transcripts. Label each request with the corresponding product type and complexity.

In [None]:
import pandas as pd

# Example dataset
data = {
    'request': [
        "I need a quote for industrial valves.",
        "Looking for pricing on high-capacity pumps.",
        "Can you provide a detailed quote for custom-made valves?",
        "Need information on replacement parts for pumps.",
        "Requesting a simple quote for standard valves.",
        "Urgently need a quote for complex pump systems.",
        "Looking for quotes on basic valves.",
        "Can you provide pricing on advanced pump models?"
    ],
    'product_type': [
        "valves", "pumps", "valves", "pumps",
        "valves", "pumps", "valves", "pumps"
    ],
    'complexity': [
        "simple", "simple", "complex", "simple",
        "simple", "complex", "simple", "complex"
    ]
}

df = pd.DataFrame(data)
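
Before splitting the data, it can help to check how the labels are distributed, since a very small or skewed class makes the later metrics unreliable. A minimal sketch using only the standard library, with the label lists copied from the example dataset above:

```python
from collections import Counter

# Label lists copied from the example dataset above
product_type = ["valves", "pumps", "valves", "pumps",
                "valves", "pumps", "valves", "pumps"]
complexity = ["simple", "simple", "complex", "simple",
              "simple", "complex", "simple", "complex"]

product_counts = Counter(product_type)
complexity_counts = Counter(complexity)

print(product_counts)     # 4 valves, 4 pumps
print(complexity_counts)  # 5 simple, 3 complex
```

With the DataFrame itself, `df['product_type'].value_counts()` gives the same counts.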


**Data Preprocessing:**

Clean and preprocess the text data, including tokenization, removing stop words, and converting text to numerical features using techniques like TF-IDF.

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer

# Split once so that both label columns share the same train/test rows
(X_train, X_test,
 y_train_product, y_test_product,
 y_train_complexity, y_test_complexity) = train_test_split(
    df['request'], df['product_type'], df['complexity'],
    test_size=0.2, random_state=42)

# Vectorize the text data using TF-IDF (shown for illustration;
# the pipelines in the model training step fit their own vectorizers on X_train)
vectorizer = TfidfVectorizer(stop_words='english')
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)
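
To make the TF-IDF step concrete, the sketch below computes the weights by hand for three of the example requests. It assumes scikit-learn's default weighting (smoothed IDF, `ln((1 + n) / (1 + df)) + 1`, followed by L2 normalization). The small `STOP_WORDS` set and the `tokenize` helper are simplified stand-ins for illustration, not the library's own stop-word list or tokenizer.

```python
import math
import re
from collections import Counter

docs = [
    "I need a quote for industrial valves.",
    "Looking for pricing on high-capacity pumps.",
    "Requesting a simple quote for standard valves.",
]

# Hypothetical, deliberately small stop-word set (not scikit-learn's list)
STOP_WORDS = {"a", "for", "i", "on"}

def tokenize(text):
    """Lowercase, keep tokens of two or more word characters, drop stop words."""
    tokens = re.findall(r"\b\w\w+\b", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

tokenized = [tokenize(d) for d in docs]
n_docs = len(docs)

# Document frequency: number of documents that contain each term
doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

def tfidf(tokens):
    """Return {term: weight} using smoothed IDF and L2 normalization."""
    counts = Counter(tokens)
    raw = {t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
           for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in raw.values()))
    return {t: w / norm for t, w in raw.items()}

weights = tfidf(tokenized[0])
for term, w in sorted(weights.items()):
    print(f"{term}: {w:.3f}")
```

Terms that appear in only one request (such as "industrial") receive a higher weight than terms shared across requests (such as "quote" and "valves"), which is what lets the classifier focus on distinguishing words.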


**Model Training:**

Train machine learning models on the preprocessed data to classify requests by product type and complexity.

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report

# Train the product type classifier
product_model = Pipeline([
    ('vectorizer', TfidfVectorizer(stop_words='english')),
    ('classifier', LogisticRegression(random_state=42))
])

product_model.fit(X_train, y_train_product)

# Train the complexity classifier
complexity_model = Pipeline([
    ('vectorizer', TfidfVectorizer(stop_words='english')),
    ('classifier', LogisticRegression(random_state=42))
])

complexity_model.fit(X_train, y_train_complexity)


**Evaluation:**

Evaluate the models using appropriate metrics like accuracy, precision, recall, and F1-score.

In [None]:
# Predict the categories of the test set for product type
y_pred_product = product_model.predict(X_test)
print("Product Type Classification Report:")
print(classification_report(y_test_product, y_pred_product))

# Predict the categories of the test set for complexity
y_pred_complexity = complexity_model.predict(X_test)
print("Complexity Classification Report:")
print(classification_report(y_test_complexity, y_pred_complexity))


Product Type Classification Report:
              precision    recall  f1-score   support

       pumps       0.00      0.00      0.00       2.0
      valves       0.00      0.00      0.00       0.0

    accuracy                           0.00       2.0
   macro avg       0.00      0.00      0.00       2.0
weighted avg       0.00      0.00      0.00       2.0

Complexity Classification Report:
              precision    recall  f1-score   support

     complex       0.00      0.00      0.00         1
      simple       0.50      1.00      0.67         1

    accuracy                           0.50         2
   macro avg       0.25      0.50      0.33         2
weighted avg       0.25      0.50      0.33         2
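
The complexity report can be reproduced by hand, which shows where each number comes from. The label lists below are an assumption inferred from the report (one `complex` and one `simple` test example, both predicted as `simple`); the notebook does not print the actual test rows.

```python
def precision_recall_f1(y_true, y_pred, label):
    """Compute precision, recall, and F1 for one label; undefined ratios become 0.0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Assumed labels, chosen to be consistent with the complexity report above
y_true = ["complex", "simple"]
y_pred = ["simple", "simple"]

for label in ("complex", "simple"):
    p, r, f = precision_recall_f1(y_true, y_pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
# complex: precision=0.00 recall=0.00 f1=0.00
# simple: precision=0.50 recall=1.00 f1=0.67
```

With only two test examples, a single misclassification moves accuracy by 50 percentage points, so these figures say little about how the models would perform on real traffic.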



Note: scikit-learn also emitted repeated `UndefinedMetricWarning` messages for these reports (the `_warn_prf` lines in the raw output). With a two-example test set, some labels have no true samples or no predicted samples, so their precision or recall is undefined and is reported as 0.


**Deployment:**

Serve the models behind a small Flask API so that new incoming requests can be classified in real time.

In [None]:
from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)

# Save the trained pipelines (vectorizer + classifier) to disk.
# In practice this belongs in the training script, not the web app.
joblib.dump(product_model, 'product_classifier_model.pkl')
joblib.dump(complexity_model, 'complexity_classifier_model.pkl')

# Load the pipelines once at startup
product_model = joblib.load('product_classifier_model.pkl')
complexity_model = joblib.load('complexity_classifier_model.pkl')

@app.route('/classify', methods=['POST'])
def classify():
    data = request.get_json(force=True)
    request_text = data['request']

    # Predict the product type and complexity; str() ensures plain Python
    # strings are passed to jsonify
    product_type = str(product_model.predict([request_text])[0])
    complexity = str(complexity_model.predict([request_text])[0])

    return jsonify({'product_type': product_type, 'complexity': complexity})

if __name__ == '__main__':
    # debug=True enables the reloader and interactive debugger;
    # use it for local development only
    app.run(debug=True)
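
The endpoint above assumes every request body is a JSON object with a `request` string. A minimal sketch of a validation helper follows; `extract_request_text` is a hypothetical name (not part of Flask), and the error messages are illustrative.

```python
def extract_request_text(payload):
    """Return (text, error); exactly one of the two is None."""
    if not isinstance(payload, dict):
        return None, "JSON body must be an object"
    text = payload.get("request")
    if not isinstance(text, str) or not text.strip():
        return None, "'request' must be a non-empty string"
    return text.strip(), None

# A valid payload yields the cleaned text and no error
print(extract_request_text({"request": "Looking for pricing on pumps."}))
# A payload with the wrong key yields an error message instead
print(extract_request_text({"text": "wrong key"}))
```

Inside `classify()`, the handler could call this helper on the parsed JSON and return the error message with a 400 status when it is not `None`, instead of failing with a `KeyError`.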


To use the classifier, send a POST request to the /classify endpoint with a customer quote request:

In [None]:
curl -X POST -H "Content-Type: application/json" -d '{"request": "Looking for pricing on high-capacity pumps."}' http://127.0.0.1:5000/classify
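
The same call can be made from Python. This sketch uses only the standard library and assumes the Flask app is running locally on port 5000, as in the `curl` example. `build_classify_request` and `classify_remote` are hypothetical helper names introduced here for illustration.

```python
import json
import urllib.request

CLASSIFY_URL = "http://127.0.0.1:5000/classify"

def build_classify_request(text, url=CLASSIFY_URL):
    """Build a JSON POST request for the /classify endpoint (does not send it)."""
    body = json.dumps({"request": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def classify_remote(text, url=CLASSIFY_URL, timeout=5):
    """Send the request and return the decoded JSON response as a dict."""
    req = build_classify_request(text, url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Building the request needs no running server; sending it does
req = build_classify_request("Looking for pricing on high-capacity pumps.")
print(req.get_method(), req.full_url)
```

Once the server is up, `classify_remote("Looking for pricing on high-capacity pumps.")` should return a dictionary with the `product_type` and `complexity` keys produced by the endpoint.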
