# Travel Destination Extractor

## Objective
This project extracts travel destinations (locations such as cities, countries, and landmarks)
from travel-related text using Named Entity Recognition (NER) with Hugging Face pipelines.


In [None]:
!pip install -q transformers torch

In [None]:
from transformers import pipeline


In [None]:
ner_pipeline = pipeline(
    "ner",
    model="dbmdz/bert-large-cased-finetuned-conll03-english",
    aggregation_strategy="simple"
)

Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Device set to use cuda:0


In [None]:
travel_text = """
Last summer I visited Paris and Rome before heading to the Swiss Alps near Zurich.
After that, I flew to New York and later relaxed on the beaches of Bali.
"""

In [None]:
entities = ner_pipeline(travel_text)

for entity in entities:
    print(entity)

{'entity_group': 'LOC', 'score': np.float32(0.9996213), 'word': 'Paris', 'start': 23, 'end': 28}
{'entity_group': 'LOC', 'score': np.float32(0.99957436), 'word': 'Rome', 'start': 33, 'end': 37}
{'entity_group': 'LOC', 'score': np.float32(0.7993264), 'word': 'Swiss Alps', 'start': 60, 'end': 70}
{'entity_group': 'LOC', 'score': np.float32(0.99964654), 'word': 'Zurich', 'start': 76, 'end': 82}
{'entity_group': 'LOC', 'score': np.float32(0.99947214), 'word': 'New York', 'start': 106, 'end': 114}
{'entity_group': 'LOC', 'score': np.float32(0.99922574), 'word': 'Bali', 'start': 151, 'end': 155}
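
The `start` and `end` fields in the output above are character offsets into the input string, with `end` exclusive, so slicing the text with them recovers each entity. A minimal sketch of that check follows; the entity records are copied from the output above with scores omitted, and `span_text` is a hypothetical helper name, not part of the `transformers` API.

```python
# Text copied from the notebook cell above (note the leading newline,
# which is why "Paris" starts at offset 23 rather than 22).
travel_text = """
Last summer I visited Paris and Rome before heading to the Swiss Alps near Zurich.
After that, I flew to New York and later relaxed on the beaches of Bali.
"""

# Entity records copied from the pipeline output above; scores omitted.
sample_entities = [
    {"entity_group": "LOC", "word": "Paris", "start": 23, "end": 28},
    {"entity_group": "LOC", "word": "Rome", "start": 33, "end": 37},
    {"entity_group": "LOC", "word": "Swiss Alps", "start": 60, "end": 70},
    {"entity_group": "LOC", "word": "Zurich", "start": 76, "end": 82},
    {"entity_group": "LOC", "word": "New York", "start": 106, "end": 114},
    {"entity_group": "LOC", "word": "Bali", "start": 151, "end": 155},
]


def span_text(text, entity):
    """Return the substring of `text` covered by the entity's offsets."""
    return text[entity["start"]:entity["end"]]


for entity in sample_entities:
    print(repr(span_text(travel_text, entity)), "matches", repr(entity["word"]))
```

Working from offsets rather than the `word` field can be useful when the original casing or spacing of the source text needs to be preserved exactly.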


In [None]:
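destinations = []

# The CoNLL-03 label set is PER, ORG, LOC, and MISC, so this model never
# emits "GPE"; cities and countries are tagged as "LOC".
for entity in entities:
    if entity["entity_group"] == "LOC":
        destinations.append(entity["word"])

print("Extracted Travel Destinations:")
# set() removes duplicates but does not preserve the order of appearance.
for place in set(destinations):
    print("-", place)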


Extracted Travel Destinations:
- Zurich
- New York
- Swiss Alps
- Paris
- Bali
- Rome


## Explanation

This project uses a transformer-based Named Entity Recognition (NER) pipeline to identify
location entities in unstructured travel text.

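The underlying model, `dbmdz/bert-large-cased-finetuned-conll03-english`, is a BERT-large cased
encoder fine-tuned on the CoNLL-2003 English NER dataset, whose entity labels are PER, ORG, LOC,
and MISC. Filtering on LOC makes it suitable for extracting place names such as cities,
countries, and landmarks. In the example run, every location scored above 0.99 except
"Swiss Alps" (about 0.80). This demonstrates how pre-trained NLP models can be applied to
real-world information extraction problems without additional training.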
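
The filtering step can also be wrapped in a small reusable function. The sketch below is one possible shape, not the notebook's own method: `extract_destinations`, its `labels` and `min_score` parameters, and the 0.9 default threshold are all assumptions made for illustration. It keeps only entities with an accepted label and a high enough score, and removes duplicates while preserving the order of first appearance.

```python
def extract_destinations(entities, labels=("LOC",), min_score=0.9):
    """Return unique entity words with an accepted label and score.

    `entities` is a list of dicts shaped like the pipeline output above.
    The default threshold of 0.9 is an arbitrary illustrative choice.
    """
    seen = {}  # dicts preserve insertion order, so this dedupes in order
    for entity in entities:
        if entity["entity_group"] in labels and float(entity["score"]) >= min_score:
            seen.setdefault(entity["word"], None)
    return list(seen)


# Hand-written sample records; the PER entry and the repeated "Paris"
# are hypothetical and do not come from the notebook's output.
sample = [
    {"entity_group": "LOC", "score": 0.9996, "word": "Paris"},
    {"entity_group": "LOC", "score": 0.7993, "word": "Swiss Alps"},
    {"entity_group": "PER", "score": 0.9900, "word": "Alice"},
    {"entity_group": "LOC", "score": 0.9995, "word": "Paris"},
]

print(extract_destinations(sample))                 # ['Paris']
print(extract_destinations(sample, min_score=0.5))  # ['Paris', 'Swiss Alps']
```

With the default threshold, a lower-confidence span such as "Swiss Alps" is dropped, so the threshold should be tuned to the text at hand rather than taken as given.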
