# TravelPurpose Quickstart

This notebook demonstrates basic usage of the TravelPurpose library, which classifies cities by travel purpose.

## Installation

```bash
pip install travelpurpose
```

In [None]:
# Import the library
from travelpurpose import predict_purpose, tags, search, load
import json

## 1. Basic Purpose Prediction

Predict travel purposes for a city:

In [None]:
# Predict purposes for Istanbul
result = predict_purpose("Istanbul")
print(json.dumps(result, indent=2))

In [None]:
# Try different cities
cities = ["Paris", "Dubai", "Antalya", "Singapore", "Zurich"]

for city in cities:
    result = predict_purpose(city, use_cache=False)
    print(f"\n{city}:")
    print(f"  Main: {', '.join(result['main'])}")
    print(f"  Confidence: {result['confidence']}")

## 2. Exploring Raw Tags

View the raw tags harvested from various sources:

In [None]:
# Get tags for a city
city_tags = tags("Barcelona")

print(f"Total tags: {len(city_tags)}\n")

# Show first 10 tags
for tag in city_tags[:10]:
    print(f"{tag['tag']:20} | {tag['source']:12} | {tag.get('evidence_type', 'N/A')}")
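
Once the raw tags are loaded, a natural next step is to see how many come from each source. The sketch below is a minimal illustration that assumes each record is a dict with `tag` and `source` keys, as in the cell above. The sample records and the source names `source_a` and `source_b` are hypothetical stand-ins, not real library output.

```python
from collections import Counter

# Hypothetical sample shaped like the records printed above; real output
# from tags() may contain other fields or values.
sample_tags = [
    {"tag": "beach", "source": "source_a", "evidence_type": "category"},
    {"tag": "architecture", "source": "source_a"},
    {"tag": "nightlife", "source": "source_b", "evidence_type": "listing"},
]


def count_by_source(tag_records):
    """Return a Counter mapping each source name to its number of tags."""
    return Counter(record.get("source", "unknown") for record in tag_records)


source_counts = count_by_source(sample_tags)
for source, count in source_counts.most_common():
    print(f"{source:12} | {count}")
```

To run this on real data, pass `city_tags` from the cell above in place of `sample_tags`.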

## 3. Searching Cities

Search for cities in the dataset:

In [None]:
# Search for cities
results = search("tokyo")

for city in results[:5]:
    # Fall back to 0 when population is missing or None, so the ',' format does not fail
    print(f"{city['name']}, {city['country']} - Pop: {(city.get('population') or 0):,}")

## 4. Understanding Confidence Scores

Confidence scores indicate how certain the classifier is about its predictions:

In [None]:
import pandas as pd  # install separately with `pip install pandas` if it is not already available

# Analyze confidence for multiple cities
test_cities = ["London", "Mecca", "Las Vegas", "Geneva", "Mumbai"]

results = []
for city in test_cities:
    pred = predict_purpose(city, use_cache=False)
    results.append({
        'City': city,
        'Main Categories': ', '.join(pred['main'][:3]),
        'Confidence': pred['confidence']
    })

df = pd.DataFrame(results)
df
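
One way to act on these scores is to separate predictions that clear a chosen threshold from those that may need manual review. The sketch below assumes `confidence` is numeric and that higher means more certain; the notebook does not state the scale, so the 0.5 threshold, the city names, and the category labels in the sample are all hypothetical.

```python
# Hypothetical predictions shaped like the predict_purpose() results above
sample_predictions = {
    "City A": {"main": ["Business", "Culture"], "confidence": 0.82},
    "City B": {"main": ["Leisure"], "confidence": 0.41},
    "City C": {"main": [], "confidence": 0.0},
}


def split_by_confidence(predictions, threshold):
    """Split city names into (confident, uncertain) lists by confidence score."""
    confident, uncertain = [], []
    for city, pred in predictions.items():
        # A missing confidence value is treated as 0, i.e. uncertain
        if pred.get("confidence", 0) >= threshold:
            confident.append(city)
        else:
            uncertain.append(city)
    return confident, uncertain


confident, uncertain = split_by_confidence(sample_predictions, threshold=0.5)
print("Confident:", confident)
print("Needs review:", uncertain)
```

The right threshold depends on how the library computes confidence, so treat 0.5 as a placeholder to tune against real results.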

## 5. Working with the Ontology

Explore the travel purpose ontology:

In [None]:
from travelpurpose.classifier import get_ontology

ontology = get_ontology()

print("Main Categories:")
for cat in ontology['main_categories']:
    print(f"  - {cat}")

print("\nSubcategories for 'Business':")
for sub in ontology['subcategories']['Business']:
    print(f"  - {sub}")
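
To get a quick overview of the ontology's shape, you can count the subcategories under each main category. This sketch assumes the structure implied by the cell above: a `main_categories` list and a `subcategories` dict keyed by category name. The sample ontology and its entries are hypothetical placeholders, not the library's actual categories.

```python
# Hypothetical ontology with the same two keys used in the cell above
sample_ontology = {
    "main_categories": ["Business", "Leisure"],
    "subcategories": {
        "Business": ["Conference", "Trade fair"],
        "Leisure": ["Beach"],
    },
}


def subcategory_counts(ontology):
    """Map each main category to its number of subcategories (0 if none listed)."""
    subs = ontology.get("subcategories", {})
    return {
        category: len(subs.get(category, []))
        for category in ontology.get("main_categories", [])
    }


counts = subcategory_counts(sample_ontology)
for category, n in counts.items():
    print(f"{category}: {n} subcategories")
```

Using `.get()` with defaults avoids a `KeyError` if a main category has no subcategory entry.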

## 6. Batch Processing

Classify a list of cities in a single loop, passing `use_cache=True` so that cached results can be reused:

In [None]:
# Batch classify cities
cities_to_classify = [
    "Rome", "Bangkok", "New York", "Cairo", "Sydney",
    "Cape Town", "Tokyo", "Rio de Janeiro", "Moscow", "Seoul"
]

classifications = []
for city in cities_to_classify:
    result = predict_purpose(city, use_cache=True)
    classifications.append({
        'city': city,
        'primary_purpose': result['main'][0] if result['main'] else 'Unknown',
        'all_purposes': result['main'],
        'confidence': result['confidence']
    })

# Convert to DataFrame for analysis
df_cities = pd.DataFrame(classifications)
df_cities.sort_values('confidence', ascending=False)
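
In a longer batch, one failing city would stop the loop above. The sketch below shows one possible way to keep going and collect failures separately. `classify_batch` and `fake_predictor` are hypothetical helpers written for this illustration, and the stand-in predictor exists only so the example runs without the library. How `predict_purpose` behaves for an unknown city is not shown in this notebook, so the exception path is an assumption.

```python
def classify_batch(cities, predictor):
    """Run predictor on each city; return (rows, failures) without stopping on errors."""
    rows, failures = [], []
    for city in cities:
        try:
            result = predictor(city)
        except Exception as exc:  # record the failure and continue with the next city
            failures.append((city, str(exc)))
            continue
        main = result.get("main") or []
        rows.append({
            "city": city,
            "primary_purpose": main[0] if main else "Unknown",
            "confidence": result.get("confidence"),
        })
    return rows, failures


# Stand-in predictor (hypothetical) so the sketch runs without the library
def fake_predictor(city):
    if city == "Nowhere":
        raise ValueError("city not found")
    return {"main": ["Leisure"], "confidence": 0.5}


rows, failures = classify_batch(["Rome", "Nowhere"], fake_predictor)
print(rows)
print(failures)
```

In this notebook, you could pass `lambda c: predict_purpose(c, use_cache=True)` as the `predictor` argument and build the DataFrame from `rows`.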

## Next Steps

- Explore `02_training_and_rules.ipynb` for advanced usage
- Check the [documentation](https://github.com/teyfikoz/Travel_Purpose-City_Tags)
- Run the data pipeline to rebuild the dataset with fresh data