In [1]:
# Import the library for HTTP requests:
import requests
# Import the Hugging Face pipeline API for Natural Language Processing (NLP) tasks:
from transformers import pipeline

In [2]:
# Function to perform Named Entity Recognition (NER) on user input text
def get_named_entities(text):
    # Load the NER pipeline provided by Hugging Face:
    ner_pipeline = pipeline("ner", aggregation_strategy="simple")

    # Perform NER on the string
    entities = ner_pipeline(text)
    
    # Build and return a list with just the text of the recognized entities (without the additional metadata)
    named_entities = []
    for entity in entities:
        named_entities.append(entity['word'])
    return named_entities
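
With `aggregation_strategy="simple"`, each item returned by the NER pipeline is a dictionary that carries a confidence `score` and an `entity_group` label next to `word`. A minimal sketch of filtering on those fields before building the list, assuming a hypothetical helper `select_entities` and an arbitrary threshold of 0.8 (neither is part of the notebook); the sample data is hand-written, not real model output:

```python
def select_entities(entities, min_score=0.8, groups=None):
    """Return the words of entities that pass the filters (hypothetical helper)."""
    selected = []
    for entity in entities:
        # Skip low-confidence predictions.
        if entity["score"] < min_score:
            continue
        # Optionally keep only certain entity types, e.g. {"PER", "ORG"}.
        if groups is not None and entity["entity_group"] not in groups:
            continue
        selected.append(entity["word"])
    return selected


# Hand-written sample in the shape the aggregated pipeline returns.
sample = [
    {"entity_group": "PER", "score": 0.99, "word": "Barack Obama", "start": 7, "end": 19},
    {"entity_group": "LOC", "score": 0.41, "word": "Hawaii", "start": 30, "end": 36},
]
print(select_entities(sample))
```

Inside `get_named_entities`, such a helper could replace the loop, for example `return select_entities(entities)`.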

In [3]:
# Function to call DBpedia given a (named) entity
def get_dbpedia_entry(entity):
    
    # SPARQL query template:
    query = f"""
    SELECT ?abstract WHERE {{
        ?entity rdfs:label "{entity}"@en .
        ?entity dbo:abstract ?abstract .
        FILTER (lang(?abstract) = 'en')
    }} LIMIT 1
    """
    # SPARQL endpoint URL:
    url = "https://dbpedia.org/sparql"
    
    # HTTP header:
    headers = {
        "Accept": "application/sparql-results+json"
    }
    
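    # Perform the SPARQL query as an HTTP GET request:
    response = requests.get(url, headers=headers, params={"query": query}, timeout=30)
    # Raise an exception on HTTP errors instead of trying to parse an error page as JSON:
    response.raise_for_status()
    data = response.json()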
    
    # Check whether the DBpedia response includes entries and, if so, return the abstract text:
    if data["results"]["bindings"]:
        return data["results"]["bindings"][0]["abstract"]["value"]
    else:
        return None
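
The query above interpolates the entity text directly into a double-quoted SPARQL literal, so an entity containing a quote or a backslash would produce a malformed query. A minimal sketch of escaping the label and of reading the abstract defensively, assuming the hypothetical helpers `escape_sparql_literal`, `build_abstract_query`, and `first_abstract` (none of them is part of the notebook):

```python
def escape_sparql_literal(value):
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (
        value.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_abstract_query(entity):
    """Build the same abstract query as above, with the label escaped."""
    label = escape_sparql_literal(entity)
    return (
        "SELECT ?abstract WHERE {\n"
        f'    ?entity rdfs:label "{label}"@en .\n'
        "    ?entity dbo:abstract ?abstract .\n"
        "    FILTER (lang(?abstract) = 'en')\n"
        "} LIMIT 1"
    )


def first_abstract(data):
    """Return the first abstract from a SPARQL JSON result, or None if there is none."""
    bindings = data.get("results", {}).get("bindings", [])
    return bindings[0]["abstract"]["value"] if bindings else None


print(build_abstract_query('Dwayne "The Rock" Johnson'))
```

Both helpers are pure functions, so they can be checked without calling the endpoint.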

In [4]:
# Answer a question from the given context using the pipeline's default question-answering model:
def llm_agent(prompt, ctxt):
    qa_pipeline = pipeline("question-answering")
    return qa_pipeline(question=prompt, context=ctxt)['answer']

In [5]:
# Example usage
question = "Who is Barack Obama?"

# Perform Named Entity Recognition to extract entities for which we can fetch context from DBpedia:
named_entities = get_named_entities(question)
print(named_entities)

# For this demo, use only the first entity:
entity = named_entities[0]
entry = get_dbpedia_entry(entity)

print("prompt", question)
print("context", entry)

# Generate a response from the model, given the DBpedia entry as context:
print(llm_agent(question, entry))

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.



Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).



No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


['Barack Obama']
prompt Who is Barack Obama?
context Barack Hussein Obama II (/bəˈrɑːk huːˈseɪn oʊˈbɑːmə/ bə-RAHK hoo-SAYN oh-BAH-mə; born August 4, 1961) is an American politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, Obama was the first African-American president of the United States. He previously served as a U.S. senator from Illinois from 2005 to 2008 and as an Illinois state senator from 1997 to 2004, and previously worked as a civil rights lawyer before entering politics. Obama was born in Honolulu, Hawaii. After graduating from Columbia University in 1983, he worked as a community organizer in Chicago. In 1988, he enrolled in Harvard Law School, where he was the first black president of the Harvard Law Review. After graduating, he became a civil rights attorney and an academic, teaching constitutional law at the University of Chicago Law School from 1992 to 2004. Turning to elective politics, he represented th


an American politician
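
The example usage assumes that NER finds at least one entity and that DBpedia returns an abstract. Otherwise `named_entities[0]` raises an `IndexError`, or the question-answering step receives `None` as its context. A minimal sketch of guarding both cases, assuming a hypothetical wrapper `answer_question` that takes the three steps as callables so it can be tried with stand-ins instead of the real models:

```python
def answer_question(question, extract_entities, fetch_context, answer):
    """Run NER, context lookup, and QA in order; return None if a step yields nothing."""
    entities = extract_entities(question)
    if not entities:
        return None  # no entity recognized, so there is nothing to look up
    context = fetch_context(entities[0])
    if context is None:
        return None  # no entry found for the first entity
    return answer(question, context)


# Stand-in callables with canned values (not real model or DBpedia output).
result = answer_question(
    "Who is Barack Obama?",
    extract_entities=lambda text: ["Barack Obama"],
    fetch_context=lambda entity: "Barack Obama is an American politician.",
    answer=lambda q, ctx: "an American politician",
)
print(result)
```

With the notebook's own functions, the call would be `answer_question(question, get_named_entities, get_dbpedia_entry, llm_agent)`.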
