# Task 4: Named Entity Recognition (NER) with BIO Tagging

## Overview
In this task, you will implement a simple **Named Entity Recognition (NER)** tagger using the **BIO tagging scheme**. Each token in a sentence will be labeled as:

- B-LABEL → beginning of an entity
- I-LABEL → inside an entity
- O → outside any entity
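
As a quick illustration of the scheme, the sketch below checks whether a tag sequence is well-formed. It assumes the strict variant of BIO, in which every entity opens with a `B-` tag and an `I-` tag may only continue an entity with the same label. The helper name `is_valid_bio` is hypothetical and not part of the task.

```python
def is_valid_bio(tags: list[str]) -> bool:
    """Return True if every I-LABEL continues a B-LABEL or I-LABEL of the same label."""
    open_label = None  # label of the entity currently open, or None
    for tag in tags:
        if tag == "O":
            open_label = None
        elif tag.startswith("B-"):
            open_label = tag[2:]
        elif tag.startswith("I-"):
            # An I- tag is only legal directly after B-/I- with the same label.
            if open_label != tag[2:]:
                return False
        else:
            return False  # not a BIO tag at all
    return True


print(is_valid_bio(["B-PERSON", "I-PERSON", "O", "B-LOCATION"]))  # True
print(is_valid_bio(["O", "I-PERSON"]))                            # False
```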

You’ll work with **character-level entity spans** of the form `(start, end, label)`, where `end` is exclusive. You will align them with whitespace-separated tokens using plain Python (no spaCy) and return one BIO tag per token.
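
Aligning character spans with tokens requires each token's own character offsets. A minimal sketch of one way to get them, assuming tokens are maximal runs of non-whitespace characters (the helper name `token_offsets` is hypothetical):

```python
import re


def token_offsets(text: str) -> list[tuple[str, int, int]]:
    """Return (token, start, end) for each whitespace-separated token; end is exclusive."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


for token, start, end in token_offsets("Barack Obama was born in Hawaii."):
    print(token, start, end)
# Barack 0 6
# Obama 7 12
# was 13 16
# born 17 21
# in 22 24
# Hawaii. 25 32
```

Reading offsets from the regex matches, rather than adding `len(word) + 1` per token, keeps the alignment correct when the text contains leading, trailing, or repeated whitespace.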

In [1]:
import re

def generate_ner_tags(text: str, entities: list[tuple[int, int, str]]) -> list[str]:
    # Tokenize on whitespace and record each token's (start, end) character
    # offsets, so leading or repeated spaces do not shift the alignment.
    spans = [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]

    # By default, every token is outside any entity.
    res = ["O"] * len(spans)

    # Process entities by start offset; sorted() leaves the caller's list untouched.
    for start, end, entity_name in sorted(entities):
        # The first overlapping token gets the 'B-' tag, the rest get 'I-'.
        is_first_word = True
        for word_idx, (word_b, word_e) in enumerate(spans):
            # A token belongs to the entity if it overlaps [start, end).
            if word_b < end and word_e > start:
                prefix = "B-" if is_first_word else "I-"
                res[word_idx] = prefix + entity_name
                is_first_word = False

    return res

In [2]:
print(generate_ner_tags("Barack Obama was born in Hawaii.", [(0, 12, "PERSON"), (25, 31, "LOCATION")]))

['B-PERSON', 'I-PERSON', 'O', 'O', 'O', 'B-LOCATION']
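
One way to sanity-check the tags is to group them back into entities. The sketch below is an illustrative helper (`bio_to_entities` is a hypothetical name, not part of the task) that assumes whitespace tokenization and silently skips any `I-` tag that does not continue an open entity with the same label.

```python
def bio_to_entities(tokens: list[str], tags: list[str]) -> list[tuple[str, str]]:
    """Group BIO-tagged tokens into (label, entity text) pairs."""
    entities = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-")
        continues = tag.startswith("I-") and tag[2:] == current_label
        # Any tag that does not continue the open entity closes it.
        if not continues and current_tokens:
            entities.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
        if starts_new:
            current_label, current_tokens = tag[2:], [token]
        elif continues:
            current_tokens.append(token)
    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((current_label, " ".join(current_tokens)))
    return entities


tokens = "Barack Obama was born in Hawaii.".split()
tags = ["B-PERSON", "I-PERSON", "O", "O", "O", "B-LOCATION"]
print(bio_to_entities(tokens, tags))
# [('PERSON', 'Barack Obama'), ('LOCATION', 'Hawaii.')]
```

The recovered location is `Hawaii.` with the trailing period, because whitespace tokenization keeps punctuation attached to the word even though the character span `(25, 31)` excludes it.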
