from transformers import pipeline

# Load QA pipeline
qa_pipeline = pipeline("question-answering", model="distilbert-base-uncased-distilled-squad")

# Context
context = """
Natural Language Processing (NLP) is a subfield of artificial intelligence that enables machines to understand and generate human language.
It has two main categories: Natural Language Understanding and Natural Language Generation. Applications include text processing, conversational systems, sentiment analysis, information extraction, and speech applications.

The general process starts with data collection and storage, followed by preprocessing. Preprocessing steps include tokenization (splitting text into smaller units), lowercasing and removing stop words, lemmatization (mapping words to their base form, more accurate but computationally heavier), and stemming (rule-based truncation to stems, faster but less accurate). These steps are known as text normalization.

Text representation methods convert words into numerical formats. Traditional count-based methods include:
- One-hot encoding: represents each word as a binary vector, simple but memory-intensive and does not capture semantics.
- Bag of Words (BoW): represents documents by word frequencies. Easy to implement but ignores context and order.
- TF-IDF: balances word frequency within a document with how rare the word is across documents. Useful for classification and keyword extraction, but still lacks semantic understanding.

Neural embeddings capture meaning and context:
- Word2Vec: CBOW predicts a word from context, fast but less accurate for rare words. Skip-Gram predicts context words from a target word, better for rare words but slower.
- GloVe: uses global co-occurrence statistics to create vectors; for example, king - man + woman ≈ queen.
- FastText: breaks words into subword units, allowing embeddings for unseen words or misspellings.

Contextual embeddings like BERT use transformer self-attention to weigh how relevant each word in a sentence is to the others, producing context-dependent word meanings (e.g., 'bank' in 'river bank' vs 'bank account').
"""



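# Interactive loop: answer questions about the context until the user types 'exit'
while True:
    # Strip surrounding whitespace so inputs like "exit " or "What is NLP? " are handled
    q = input("Question (type 'exit' to quit): ").strip()
    if q.lower() == "exit":
        break
    if q.endswith("?"):
        # The pipeline returns a dict with 'answer', 'score', 'start', and 'end'
        result = qa_pipeline(question=q, context=context)
        print(f"Question: {q}")
        print(f"Answer: {result['answer']}")  # type: ignore
    else:
        print("Please ask a valid question ending with '?'")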
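
# Optional sketch (not part of the original loop): besides 'answer', the
# question-answering pipeline's result dict also carries a 'score' between
# 0 and 1. An extractive model always returns some span of the context, even
# for unrelated questions, so it can help to flag low-scoring answers.
# The helper name `format_answer` and the 0.2 threshold are assumptions made
# for illustration; the threshold is not a tuned value.
def format_answer(result, min_score=0.2):
    """Return the answer text, or a fallback message when the score is low."""
    if result["score"] < min_score:
        return f"Not sure (score {result['score']:.2f}); try rephrasing the question."
    return f"{result['answer']} (score {result['score']:.2f})"

# Usage, assuming qa_pipeline and context are defined as above:
#   print(format_answer(qa_pipeline(question="What is TF-IDF?", context=context)))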
