Google Colab: https://colab.research.google.com/drive/1o0JFJAF831rQW6FYIG3-3t3w-xmJqpGf?usp=sharing

HuggingFace: https://huggingface.co/deepset/roberta-base-squad2

## Libraries Explained

- **dotenv**: Loads environment variables from a `.env` file into the application's environment, helping manage configuration separately from code.

- **huggingface_hub**: Client library for the Hugging Face Hub. Two of its entry points are used here:
  - **HfApi**: Provides programmatic access to the Hugging Face model hub for uploading, downloading, and managing models.
  - **hf_hub_download**: Simplifies downloading model files from the Hugging Face hub to your local environment.

- **transformers**: Offers pre-trained models for natural language processing tasks. The `pipeline` function specifically provides an easy-to-use interface for common NLP tasks like text generation, sentiment analysis, and question answering.


In [1]:
import json
import os
from datetime import datetime  # the class, used later for isinstance checks

from dotenv import load_dotenv
from huggingface_hub import HfApi, hf_hub_download
from transformers import pipeline


# Loading Environment Variables for Hugging Face


This code snippet performs two essential operations:

1. `load_dotenv()` - Loads environment variables from a `.env` file into the application's environment. This is a common pattern for securely storing configuration and sensitive information outside of the source code.

2. `hf_key = os.getenv("HF_TOKEN")` - Retrieves the Hugging Face API token from the environment and assigns it to `hf_key`. A token is required for authenticated access to the Hugging Face Hub, such as downloading private or gated models. The model used here is public and `hf_key` is not passed to any later call, so the notebook also runs when `HF_TOKEN` is unset (in which case `hf_key` is `None`).


In [2]:
load_dotenv()
hf_key=os.getenv("HF_TOKEN")
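
Because `os.getenv` returns `None` when the variable is missing, a small guard can make that case explicit. The sketch below is illustrative only: `read_hf_token` is a hypothetical helper, not part of `dotenv` or `huggingface_hub`, and it assumes the token is stored under `HF_TOKEN` as in the cell above.

```python
import os


def read_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in env_var, or None if it is unset or blank.

    Hypothetical helper for illustration; not part of any library.
    """
    value = os.getenv(env_var)
    if value is None:
        return None
    value = value.strip()  # tolerate stray whitespace from a hand-edited .env
    return value or None


token = read_hf_token()
if token is None:
    print("HF_TOKEN is not set; only public models can be downloaded.")
else:
    print("HF_TOKEN found.")  # the token itself is deliberately not printed
```

The returned value could then be passed as the `token` argument of `hf_hub_download` when a private or gated repository is involved.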


# Hugging Face Model Reference

[deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2)


## Model Overview

This reference points to a RoBERTa model fine-tuned specifically for question answering tasks using the SQuAD 2.0 dataset.

## Key Specifications

- **Base Architecture**: RoBERTa (Robustly Optimized BERT Approach)
- **Model Size**: Base variant (125M parameters)
- **Developer**: deepset AI
- **Fine-tuning**: SQuAD 2.0 (Stanford Question Answering Dataset)
- **Task**: Extractive question answering

## Model Capabilities

- Answers questions by locating relevant spans within provided context
- Handles unanswerable questions (a key feature of SQuAD 2.0)
- Extracts precise answers rather than generating them
- Works best with factual questions about information present in text

## Technical Details

- Input: A question and a context passage
- Output: The answer span, given as start and end positions within the context, with a confidence score
- Training data: Both answerable and unanswerable questions
- Language: English


This model is particularly useful for applications requiring information extraction from documents, search engines, and question-answering systems.
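
To make the output format concrete, the sketch below shows how `start` and `end` relate to the context string: they are character offsets, with `end` exclusive, so slicing the context recovers the answer. The context and the result dictionary are hand-written for illustration and are not produced by the model; the `score` value is arbitrary.

```python
# Hand-written example data; a real result would come from the pipeline.
context = "RoBERTa was fine-tuned on SQuAD 2.0 for extractive question answering."
answer = "SQuAD 2.0"

start = context.find(answer)  # character offset where the span begins
end = start + len(answer)     # exclusive end offset

# Same keys as the result dictionaries printed later in this notebook
result = {"score": 0.5, "start": start, "end": end, "answer": answer}


def span_matches(result, context):
    """Check that slicing the context by the offsets reproduces the answer."""
    return context[result["start"]:result["end"]] == result["answer"]


print(result["start"], result["end"], span_matches(result, context))
```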

In [3]:
hf_reference='deepset/roberta-base-squad2'


# Downloading Specific Model Files from Hugging Face Hub


This cell downloads a chosen set of files from a Hugging Face model repository rather than the whole repository:

1. **File Definition**: First, a list of commonly required files for transformer models is defined, with comments explaining each file's purpose:
   - Vocabulary files for tokenization
   - Configuration files for model architecture
   - Tokenizer files for text preprocessing
   - Model weights in different formats (PyTorch and SafeTensors)

2. **Selective Download**: The code iterates through each file in the list and:
   - Attempts to download it using `hf_hub_download()`
   - Specifies the model repository via `repo_id=hf_reference`
   - Saves files to a local directory structure based on the model name
   - Prints the local path where each file is saved

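3. **Error Handling**: The try-except block catches and reports any download failure, so the loop continues when a file is not available for this model. Not every model ships every file in the list; RoBERTa tokenizers, for example, use `vocab.json` and `merges.txt` rather than `vocab.txt`.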


In [None]:
# List of required files
required_files = [
    "vocab.txt",          # WordPiece vocabulary (BERT-style models, if applicable)
    "vocab.json",         # BPE vocabulary (if applicable)
    "config.json",        # Model configuration
    "tokenizer.json",     # Serialized fast tokenizer (if applicable)
    "merges.txt",         # BPE merge rules (if applicable)
    "pytorch_model.bin",  # Model weights (PyTorch format)
    "model.safetensors",  # Model weights (SafeTensors format)
]


# Download only the required files
for file_name in required_files:
    try:
        print()
        print(f"Attempting to download: {file_name}")
        local_path = hf_hub_download(
            repo_id=hf_reference,
            filename=file_name,
            local_dir=f"models/{hf_reference.split('/')[1]}",
        )
        print(f"Saved to: {local_path}")
    except Exception as e:
        print(f"Could not download {file_name}: {e}")
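
As an alternative to relying on exceptions, the repository listing can be compared against the wanted files before any download is attempted. The sketch below shows only the comparison step. `plan_downloads` and `local_dir_for` are hypothetical helpers, and `example_listing` is a made-up listing standing in for what `HfApi().list_repo_files(repo_id)` would return.

```python
from pathlib import PurePosixPath


def local_dir_for(repo_id, root="models"):
    """Map 'owner/name' to 'root/name', like the split('/')[1] idiom above."""
    return str(PurePosixPath(root) / repo_id.split("/")[-1])


def plan_downloads(wanted, repo_files):
    """Split wanted files into those present in the repo and those missing."""
    available = set(repo_files)
    present = [name for name in wanted if name in available]
    missing = [name for name in wanted if name not in available]
    return present, missing


wanted = ["vocab.txt", "vocab.json", "config.json", "merges.txt", "model.safetensors"]

# Made-up listing for illustration; a real one would come from the Hub.
example_listing = ["config.json", "merges.txt", "vocab.json", "model.safetensors", "README.md"]

present, missing = plan_downloads(wanted, example_listing)
print("download:", present)
print("skip:", missing)
print("target directory:", local_dir_for("deepset/roberta-base-squad2"))
```

Only the files in `present` would then be passed to `hf_hub_download`, which keeps the try-except block for genuine failures such as network errors.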
        
    


# Setting Up Question Answering Pipelines

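This cell sets up question answering from one of two model sources. As written, only the remote (cached) pipeline is created; the line that creates the local pipeline is commented out and can be enabled once the model files are in the local directory: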

### Remote Model (Cached)
- `hf_model_cache`: Uses the model directly from Hugging Face's model hub
  - References `hf_reference` ("deepset/roberta-base-squad2")
  - Downloads and caches the model on first use
  - Resolves the repository's default revision (`main`) unless a `revision` is passed to `pipeline`
  - Requires internet connection initially

### Local Model
- `hf_model_local`: Uses a locally saved version of the same model
  - Path: `"models/roberta-base-squad2"`
  - Extracts model name from the reference using string splitting
  - Assumes the model has been previously downloaded to the local directory
  - Works offline after download

## Pipeline Functionality

Both pipelines provide:
- Extractive question answering capabilities
- Answer span identification in provided context
- Confidence scores for predicted answers
- Handling of unanswerable questions, if `handle_impossible_answer=True` is passed when calling the pipeline (it defaults to `False`)

## Usage Considerations

- **Remote/Cached**: Best for initial development, since no manual download step is needed
- **Local**: Preferred for production, offline usage, or repeated access
- Both load the same weights, so answers and scores should match
- The local version avoids repeated downloads and potential API rate limits

This dual setup provides flexibility while ensuring consistent behavior across different deployment scenarios.

In [4]:
hf_model_cache = pipeline("question-answering", model=hf_reference)
# hf_model_local = pipeline("question-answering", model=f"models/{hf_reference.split('/')[1]}")

Device set to use mps:0
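
One way to switch between the two sources without editing the cell is to prefer the local directory when it looks complete and fall back to the Hub reference otherwise. This is a minimal sketch: `resolve_model_source` is a hypothetical helper, and treating the presence of `config.json` as the sign of a usable local copy is an assumption, not a rule enforced by `transformers`.

```python
import os
import tempfile


def resolve_model_source(repo_id, root="models"):
    """Return the local model directory if it contains config.json, else repo_id."""
    local_dir = os.path.join(root, repo_id.split("/")[-1])
    if os.path.isfile(os.path.join(local_dir, "config.json")):
        return local_dir
    return repo_id


# Demonstration in a temporary directory so no real model files are needed
with tempfile.TemporaryDirectory() as root:
    repo_id = "deepset/roberta-base-squad2"
    print(resolve_model_source(repo_id, root))  # no local copy yet

    model_dir = os.path.join(root, "roberta-base-squad2")
    os.makedirs(model_dir)
    with open(os.path.join(model_dir, "config.json"), "w") as handle:
        handle.write("{}")  # placeholder file, not a real configuration
    print(resolve_model_source(repo_id, root))  # now points at the local copy
```

The returned string could be passed as the `model` argument of `pipeline("question-answering", ...)`.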


# Extracting Customer Responsibilities with Question Answering

This example demonstrates extractive question answering on a legal disclaimer text:

### Input Components
- **Query**: "what are customer's responsibilities"
  - Seeks to identify obligations or responsibilities mentioned in the text
  - Phrased as a straightforward question

- **Context**: NVIDIA legal disclaimer text
  - Contains information about limitations of liability
  - Defines what NVIDIA is and isn't responsible for
  - Typical boilerplate legal language

### Question Answering Process
- **Model**: `deepset/roberta-base-squad2` (referenced through `hf_model_cache`)
- **Parameters**:
  - `top_k=3`: Returns the 3 most likely answers instead of just the best one
  - This provides alternative interpretations of what might constitute "responsibilities"

### Output Format
The code prints the three candidate answers, each as a dictionary followed by a separator line. Each dictionary contains:
- `answer`: the extracted text span
- `start` and `end`: character positions of the span within the context
- `score`: the confidence score

The text does not list customer responsibilities; it only states what NVIDIA is not responsible for, so no span truly answers the question. The output below reflects this: all three scores are under 0.001, and the candidates are overlapping variants of the same phrase about errors in the document.

In [6]:
query = "what are customer's responsibilities"
text = '''This document is provided for information purposes only and shall not be regarded as a warranty of a certain functionality, condition, or quality of a product. NVIDIA
Corporation (“NVIDIA”) makes no representations or warranties, expressed or implied, as to the accuracy or completeness of the information contained in this document and
assumes no responsibility for any errors contained herein. NVIDIA shall have no liability for the consequences or use of such information or for any infringement of patents
or other rights of third parties that may result from its use. This document is not a commitment to develop, release, or deliver any Material (defined below), code, or
functionality.'''

# Ask the pipeline for the three highest-scoring answer spans
results = hf_model_cache(question=query, context=text, top_k=3)

for result in results:
    print(result)
    print("*****************")
  

{'score': 0.0008512335480190814, 'start': 371, 'end': 394, 'answer': 'errors contained herein'}
*****************
{'score': 0.0005820643855258822, 'start': 367, 'end': 394, 'answer': 'any errors contained herein'}
*****************
{'score': 0.0005411853780969977, 'start': 345, 'end': 394, 'answer': 'no responsibility for any errors contained herein'}
*****************
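
Scores this low suggest that none of the candidates should be presented as an answer. A simple post-processing step is to drop candidates below a confidence threshold. The sketch below reuses the three results printed above; `filter_answers` is a hypothetical helper, and the threshold of 0.1 is an arbitrary assumption that would need tuning on real data.

```python
# The three results printed above, copied verbatim
results = [
    {"score": 0.0008512335480190814, "start": 371, "end": 394,
     "answer": "errors contained herein"},
    {"score": 0.0005820643855258822, "start": 367, "end": 394,
     "answer": "any errors contained herein"},
    {"score": 0.0005411853780969977, "start": 345, "end": 394,
     "answer": "no responsibility for any errors contained herein"},
]


def filter_answers(results, min_score=0.1):
    """Keep only candidates whose score reaches min_score, best first."""
    kept = [r for r in results if r["score"] >= min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


confident = filter_answers(results)
if confident:
    for r in confident:
        print(f"{r['score']:.3f}  {r['answer']}")
else:
    print("No answer met the confidence threshold.")
```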



# Serialize and Save Model Information from Hugging Face Hub


This code demonstrates how to retrieve, serialize, and save detailed model information from the Hugging Face Hub:

1. **Serialization Function**: The `serialize_object()` function handles complex objects recursively:
   - Converts datetime objects to ISO format strings
   - Transforms objects with `__dict__` attributes into dictionaries
   - Processes nested lists and dictionaries
   - Preserves primitive data types

2. **API Interaction**: Creates an instance of the Hugging Face API client

3. **Model Information**: Fetches comprehensive metadata about the specified model using `api.model_info()`

4. **File Operations**: 
   - Extracts the model name from the reference path
   - Creates a JSON file named after the model
   - Serializes the model information and writes it to the file

This allows for local storage of model metadata for later reference or analysis, particularly useful for model governance, versioning, and documentation purposes.


In [None]:
def serialize_object(obj):
    """
    Helper function to serialize custom objects like EvalResult.
    Converts objects with __dict__ attribute to dictionaries and handles datetime objects.
    """
    if isinstance(obj, datetime):
        return obj.isoformat()  # Convert datetime to ISO 8601 string
    elif hasattr(obj, "__dict__"):
        return {key: serialize_object(value) for key, value in obj.__dict__.items()}
    elif isinstance(obj, list):
        return [serialize_object(item) for item in obj]
    elif isinstance(obj, dict):
        return {key: serialize_object(value) for key, value in obj.items()}
    else:
        return obj  # Return the value as-is for primitive types

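api = HfApi()
# Make sure the target directory exists even if the download cell was skipped
os.makedirs("models", exist_ok=True)
with open(f"models/{hf_reference.split('/')[1]}.json", "w") as json_file: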
    json_file.write(json.dumps(serialize_object(api.model_info(hf_reference))))
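
A more compact variant of the same idea uses the `default` hook of `json.dumps`, which the encoder calls for any object it cannot serialize on its own. The sketch below is an alternative to `serialize_object`, not the notebook's method; `ExampleInfo` is a made-up stand-in for the object returned by `api.model_info()`.

```python
import json
from datetime import datetime


def json_fallback(obj):
    """Called by json.dumps for objects it cannot encode natively."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if hasattr(obj, "__dict__"):
        return vars(obj)  # nested values trigger further fallback calls
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


class ExampleInfo:
    """Made-up stand-in for a model metadata object."""

    def __init__(self):
        self.id = "example/model"
        self.last_modified = datetime(2024, 1, 15, 12, 30)
        self.tags = ["question-answering", "en"]


encoded = json.dumps(ExampleInfo(), default=json_fallback, indent=2)
print(encoded)
```

Unlike `serialize_object`, this version raises a `TypeError` for unsupported types instead of passing them through, which surfaces serialization problems earlier.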
