# Conversational Language Understanding: Intent Detection & Entity Extraction

This notebook demonstrates how to use Hugging Face Transformers for intent detection and entity extraction in conversational text.

In [1]:
# Install required packages
!pip install transformers torch



## Intent Detection

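Intent detection identifies what a user wants to accomplish with an utterance. We'll use zero-shot classification, which scores the utterance against a list of candidate intent labels using a pretrained natural language inference (NLI) model, so no task-specific training is required.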
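
Under the hood, an NLI-based zero-shot classifier turns each candidate label into a hypothesis sentence and asks the model whether the utterance entails it. The sketch below shows only the bookkeeping around the model, assuming the pipeline's default template `This example is {}.` and single-label mode, where the entailment logits are normalized with a softmax across labels. The logit values are made-up placeholders, not real model output, and `build_hypotheses` and `softmax` are hypothetical helper names.

```python
import math

def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]

def softmax(logits):
    """Normalize raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["book_flight", "check_weather", "play_music"]
hypotheses = build_hypotheses(labels)

# Placeholder entailment logits, one per hypothesis (NOT real model output)
entailment_logits = [2.0, 0.5, -1.0]
probabilities = softmax(entailment_logits)

for hypothesis, prob in zip(hypotheses, probabilities):
    print(f"{hypothesis!r}: {prob:.4f}")
```

Because the scores are normalized across the candidate list, adding or removing labels changes every probability, which is worth keeping in mind when comparing confidence values between runs.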

In [None]:
from transformers import pipeline

# Initialize the zero-shot classification pipeline for intent detection
intent_classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')

# Define possible intents
candidate_labels = ['book_flight', 'check_weather', 'play_music', 'order_food', 'cancel_booking', 'get_directions']

# Example user utterance
text = "I want to book a flight to Paris next week."

# Perform intent detection
result = intent_classifier(text, candidate_labels)

print("User Input:", text)
print("\nIntent Classification Results:")
for label, score in zip(result['labels'], result['scores']):
    print(f"  {label}: {score:.4f}")
print(f"\nPredicted Intent: {result['labels'][0]} (confidence: {result['scores'][0]:.4f})")

User Input: I want to book a flight to Paris next week.

Intent Classification Results:
  book_flight: 0.4001
  get_directions: 0.2047
  check_weather: 0.1389
  play_music: 0.1300
  order_food: 0.0984
  cancel_booking: 0.0278

Predicted Intent: book_flight (confidence: 0.4001)
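
A top score of 0.40 wins the ranking but is far from certain, so a dialogue system will often want a fallback when confidence is low. The following is a minimal sketch, assuming a hypothetical `pick_intent` helper, a hypothetical `fallback` label, and an arbitrary threshold of 0.5 that you would tune on your own data. The `result` dictionary is copied from the output above rather than produced by a live model call.

```python
def pick_intent(result, threshold=0.5, fallback="fallback"):
    """Return the top label, or `fallback` if its score is below `threshold`."""
    top_label, top_score = result["labels"][0], result["scores"][0]
    return top_label if top_score >= threshold else fallback

# Scores copied from the run shown above (labels are sorted by descending score)
result = {
    "labels": ["book_flight", "get_directions", "check_weather",
               "play_music", "order_food", "cancel_booking"],
    "scores": [0.4001, 0.2047, 0.1389, 0.1300, 0.0984, 0.0278],
}

print(pick_intent(result))                 # default threshold of 0.5
print(pick_intent(result, threshold=0.3))  # more permissive threshold
```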



## Entity Extraction

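Entity extraction identifies and extracts specific pieces of information (entities) from user input, such as locations, names, and organizations. The CoNLL-2003 model used below is trained on four entity types: persons (`PER`), organizations (`ORG`), locations (`LOC`), and miscellaneous (`MISC`). Dates and times are not among its labels, so expressions like "June 10th" or "7 PM" should not be expected in the results.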
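
Token-classification models label individual tokens with BIO-style tags such as `B-LOC` and `I-LOC` (beginning and inside of a location), and the pipeline then merges adjacent tokens into entity spans. The following is a simplified sketch of that merging step, not the library's actual implementation; `group_bio_tags` is a hypothetical helper and the tagged tokens are hand-written for illustration.

```python
def group_bio_tags(tokens, tags):
    """Merge consecutive B-/I- tagged tokens into (text, label) entity spans."""
    spans = []
    current_words, current_label = [], None

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and label == current_label
        if not continues:
            # Close the open span before starting a new one (or hitting "O")
            if current_words:
                spans.append((" ".join(current_words), current_label))
            current_words, current_label = [], None
        if tag != "O":
            current_words.append(token)
            current_label = label

    # Flush a span that runs to the end of the sentence
    if current_words:
        spans.append((" ".join(current_words), current_label))
    return spans

# Hand-written tags for illustration (NOT real model output)
tokens = ["Book", "a", "flight", "from", "New", "York", "to", "Paris", "for", "John", "Smith"]
tags = ["O", "O", "O", "O", "B-LOC", "I-LOC", "O", "B-LOC", "O", "B-PER", "I-PER"]

for text, label in group_bio_tags(tokens, tags):
    print(f"{text}: {label}")
```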

In [None]:
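# Initialize the NER pipeline for entity extraction.
# aggregation_strategy='simple' merges word pieces into whole entities and
# replaces the deprecated grouped_entities=True flag.
ner_pipeline = pipeline('ner', model='dbmdz/bert-large-cased-finetuned-conll03-english', aggregation_strategy='simple')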

# Example text with various entities
text = "Book a flight from New York to Paris on June 10th for John Smith."

# Extract entities
entities = ner_pipeline(text)

print("User Input:", text)
print("\nExtracted Entities:")
for entity in entities:
    print(f"  {entity['word']}: {entity['entity_group']} (confidence: {entity['score']:.4f})")

# Additional example mentioning a hotel, a city, and a time expression
print("\n" + "="*50)
text2 = "I need to cancel my reservation at the Hilton Hotel in London for tomorrow at 7 PM."
entities2 = ner_pipeline(text2)

print("User Input:", text2)
print("\nExtracted Entities:")
for entity in entities2:
    print(f"  {entity['word']}: {entity['entity_group']} (confidence: {entity['score']:.4f})")

## Combined CLU Example

Let's combine intent detection and entity extraction into a single function that performs a basic conversational language understanding (CLU) analysis of an utterance.

In [None]:
def analyze_user_input(text, intent_labels=None):
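    """
    Run intent detection and entity extraction on a single utterance.

    Returns a dict with the input text, the predicted intent (including
    the scores for all candidate labels), and the extracted entities.
    """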
    if intent_labels is None:
        intent_labels = ['book_flight', 'check_weather', 'play_music', 'order_food', 
                        'cancel_booking', 'get_directions', 'make_reservation']
    
    # Intent Detection
    intent_result = intent_classifier(text, intent_labels)
    
    # Entity Extraction
    entities = ner_pipeline(text)
    
    # Format results
    analysis = {
        'input': text,
        'intent': {
            'predicted': intent_result['labels'][0],
            'confidence': intent_result['scores'][0],
            'all_scores': dict(zip(intent_result['labels'], intent_result['scores']))
        },
        'entities': [
            {
                'text': entity['word'],
                'label': entity['entity_group'],
                'confidence': float(entity['score'])  # plain float; the pipeline may return a NumPy scalar
            }
            for entity in entities
        ]
    }
    
    return analysis

# Test the combined function
test_utterances = [
    "I want to book a flight from New York to Tokyo on December 15th",
    "What's the weather like in San Francisco today?",
    "Play some jazz music by Miles Davis",
    "Cancel my dinner reservation at Le Bernardin for tonight"
]

for utterance in test_utterances:
    print("="*60)
    result = analyze_user_input(utterance)
    
    print(f"Input: {result['input']}")
    print(f"Intent: {result['intent']['predicted']} (confidence: {result['intent']['confidence']:.4f})")
    print("Entities:")
    for entity in result['entities']:
        print(f"  - {entity['text']}: {entity['label']} ({entity['confidence']:.4f})")
    print()
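
Downstream code usually needs the entities organized by type rather than as a flat list. As a minimal sketch, the hypothetical `entities_to_slots` helper below groups the `entities` list of an analysis dictionary by label and drops low-confidence spans. The `min_confidence` cutoff of 0.8 is an arbitrary assumption, and the sample dictionary is hand-written in the shape returned by `analyze_user_input`, not real model output.

```python
from collections import defaultdict

def entities_to_slots(analysis, min_confidence=0.8):
    """Group entity texts by label, keeping only confident predictions."""
    slots = defaultdict(list)
    for entity in analysis["entities"]:
        if entity["confidence"] >= min_confidence:
            slots[entity["label"]].append(entity["text"])
    return dict(slots)

# Hand-written placeholder in the shape produced by analyze_user_input
sample_analysis = {
    "input": "I want to book a flight from New York to Tokyo on December 15th",
    "intent": {
        "predicted": "book_flight",
        "confidence": 0.9,
        "all_scores": {"book_flight": 0.9},
    },
    "entities": [
        {"text": "New York", "label": "LOC", "confidence": 0.99},
        {"text": "Tokyo", "label": "LOC", "confidence": 0.98},
        {"text": "December", "label": "MISC", "confidence": 0.45},
    ],
}

print(entities_to_slots(sample_analysis))
```

Grouping by label alone cannot tell the origin from the destination; distinguishing roles needs extra logic, such as inspecting the preposition that precedes each location.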

## Interactive Experimentation

Try modifying the text and candidate labels below to experiment with different CLU scenarios. You can add your own intents and test various user utterances.

In [None]:
# Experiment with your own examples
# Modify these variables to test different scenarios

# Your custom intent labels
custom_intents = [
    'book_flight', 'check_weather', 'play_music', 'order_food',
    'make_reservation', 'cancel_booking', 'get_directions', 'set_reminder'
]

# Your test utterance
user_input = "Remind me to call my doctor tomorrow at 3 PM"

# Analyze the input
result = analyze_user_input(user_input, custom_intents)

print("🎯 CLU Analysis Results")
print("="*40)
print(f"📝 Input: {result['input']}")
print(f"🎯 Intent: {result['intent']['predicted']}")
print(f"📊 Confidence: {result['intent']['confidence']:.4f}")
print("\n🏷️  Entities Found:")
if result['entities']:
    for entity in result['entities']:
        print(f"   • {entity['text']} → {entity['label']} ({entity['confidence']:.4f})")
else:
    print("   • No entities detected")

print("\n📈 All Intent Scores:")
for intent, score in list(result['intent']['all_scores'].items())[:5]:  # Top 5 (the pipeline returns labels sorted by descending score)
    print(f"   • {intent}: {score:.4f}")
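
The top-5 slice above relies on the score dictionary already being ordered from best to worst. If the scores come from somewhere that does not guarantee that order, an explicit sort is safer. The sketch below assumes a hypothetical `top_k_intents` helper and uses hand-written placeholder scores in deliberately shuffled order.

```python
def top_k_intents(all_scores, k=3):
    """Return the k highest-scoring (intent, score) pairs, best first."""
    ranked = sorted(all_scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Placeholder scores in shuffled order (NOT real model output)
all_scores = {
    "play_music": 0.05,
    "set_reminder": 0.70,
    "check_weather": 0.10,
    "make_reservation": 0.15,
}

for intent, score in top_k_intents(all_scores, k=2):
    print(f"{intent}: {score:.4f}")
```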