
# Occupancy Query and Response Generation

This Jupyter Notebook demonstrates how to query an occupancy database and generate responses using an instruction-tuned language model. The workflow includes:

1. **Environment Setup**: Import necessary libraries, set up MongoDB connection, and configure model parameters.
2. **Model Loading**: Load a pre-trained language model for text generation.
3. **Database Query**: Define functions to query the occupancy database based on given criteria.
4. **Response Generation**: Generate responses using the instruction-tuned model based on the queried occupancy records.
5. **Example Usage**: Provide sample inputs and demonstrate the querying and response generation process.

This notebook is useful for generating natural language responses based on occupancy data, which can be applied in various scenarios such as reporting, analytics, and automated customer support.


In [1]:

from pymongo import MongoClient
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from datetime import datetime
import os 
from dotenv import load_dotenv
from unsloth import FastLanguageModel
import torch

import gc

gc.collect()
torch.cuda.empty_cache()


max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

load_dotenv()
# MongoDB setup
# Connect using the connection string, database, and collection names read from the environment (.env)
client = MongoClient(os.getenv("MONGO_URI"))
db = client[os.getenv("DB")]
collection = db[os.getenv("COLLECTION")]



🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
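
The setup cell reads four environment variables (`MONGO_URI`, `DB`, `COLLECTION`, `MODEL`), and `os.getenv` returns `None` for any that are unset, so a missing entry in `.env` tends to surface later as a less obvious error. A minimal sketch of an up-front check follows; the helper names `missing_env_vars` and `require_env` are hypothetical and not part of the original notebook.

```python
import os

# Variable names taken from the setup and model-loading cells of this notebook.
REQUIRED_VARS = ("MONGO_URI", "DB", "COLLECTION", "MODEL")

def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

def require_env(names=REQUIRED_VARS, env=None):
    """Raise one readable error that lists every missing variable."""
    missing = missing_env_vars(names, env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` right after `load_dotenv()` would stop the notebook early with a message that names the missing variables.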


In [2]:
# Load the model named in the MODEL environment variable (4-bit quantized via Unsloth)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = os.getenv("MODEL"), # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = HF_TOKEN
    )
FastLanguageModel.for_inference(model) 

local_llm = pipeline("text-generation", model=model, tokenizer=tokenizer)



==((====))==  Unsloth 2024.8: Fast Llama patching. Transformers = 4.44.2.
   \\   /|    GPU: NVIDIA GeForce RTX 4050 Laptop GPU. Max memory: 5.997 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.4.0. CUDA = 8.9. CUDA Toolkit = 12.1.
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.27.post2. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth




ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules in 32-bit, you need to set `load_in_8bit_fp32_cpu_offload=True` and pass a custom `device_map` to `from_pretrained`. Check https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu for more details. 
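
The error above indicates that the quantized model did not fit entirely on the GPU, which the banner reports as having 5.997 GB. As a rough sanity check before loading, the size of the weights alone can be estimated from the parameter count and the bits per parameter. This is a back-of-the-envelope sketch: the 7B parameter count is a hypothetical example (the actual model comes from the `MODEL` environment variable), the 1 GB headroom is an arbitrary assumption, and the estimate ignores quantization overhead, activations, the KV cache, and memory held by other processes.

```python
def estimate_weight_gb(n_params, bits_per_param):
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

def fits_on_gpu(n_params, bits_per_param, gpu_gb, headroom_gb=1.0):
    """True if the weights plus a fixed headroom allowance fit within gpu_gb."""
    return estimate_weight_gb(n_params, bits_per_param) + headroom_gb <= gpu_gb

# Hypothetical 7B-parameter model against the 5.997 GB reported in the banner
print(estimate_weight_gb(7e9, 4))             # weights at 4-bit
print(fits_on_gpu(7e9, 4, gpu_gb=5.997))      # 4-bit with headroom
print(fits_on_gpu(7e9, 16, gpu_gb=5.997))     # 16-bit with headroom
```

If the estimate is close to the available memory, a smaller model or a smaller `max_seq_length` is the more likely fix than CPU offload, which the error message describes as requiring a custom `device_map`.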

### Generating a response

In [3]:
# Generate a response from the model for a given prompt
def generate_instruction_response(prompt, max_new_tokens=128):
    # Tokenize to PyTorch tensors and move them to the same device as the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # max_new_tokens limits only the generated tokens; max_length would also count the prompt
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, dropping the echoed prompt and special tokens
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

### Querying the database

In [4]:
def get_occupancy_by_criteria(criteria):
    """
    Query the occupancy collection based on various criteria.
    
    Parameters:
    criteria (dict): A dictionary containing the query criteria.
    
    Returns:
    list: A list of records matching the criteria.
    """
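    # Copy the criteria so the caller's dictionary is not mutated
    query = dict(criteria)
    # Match timestamps as ISO-8601 strings (assumes the collection stores `time` in this format)
    if isinstance(query.get('time'), datetime):
        query['time'] = query['time'].strftime("%Y-%m-%dT%H:%M:%S")

    records = collection.find(query)
    return list(records)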
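
The query above matches `time` exactly, to the second, so a timestamp that is off by one second returns nothing. A minimal sketch of a range filter follows, assuming (as the `strftime` call implies) that `time` is stored as a fixed-width ISO-8601 string, which sorts lexicographically in chronological order. The helper name `time_window_filter` and the five-minute default are illustrative assumptions.

```python
from datetime import datetime, timedelta

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"  # same format string used by get_occupancy_by_criteria

def time_window_filter(center, minutes=5):
    """Build a MongoDB filter matching `time` strings within +/- `minutes` of `center`."""
    delta = timedelta(minutes=minutes)
    return {
        "time": {
            "$gte": (center - delta).strftime(ISO_FORMAT),
            "$lte": (center + delta).strftime(ISO_FORMAT),
        }
    }

window = time_window_filter(datetime(2024, 9, 15, 19, 24, 17))
print(window)
# {'time': {'$gte': '2024-09-15T19:19:17', '$lte': '2024-09-15T19:29:17'}}
```

The resulting dictionary can be passed straight to `collection.find(window)`, since it already contains formatted strings rather than a `datetime`.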

In [5]:
def query_occupancy_with_instruction_model(criteria):
    """
    Queries occupancy records based on the given criteria and generates a response using an instruction-tuned model.
    Args:
        criteria (dict): A dictionary containing the criteria to filter occupancy records.
    Returns:
        str: A response generated by the instruction-tuned model based on the occupancy records.
             If no records are found, returns a message indicating no records were found.
    """
    # Get occupancy records based on the given criteria
    records = get_occupancy_by_criteria(criteria)
    
    if records:
        # Build the prompt: the question followed by the matching records
        prompt = f"What was the occupancy based on the given criteria? The occupancy records are {records}"
        # Generate a response using the instruction-tuned model
        response = generate_instruction_response(prompt)
        return response
    else:
        return "No records found based on the given criteria."
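
Interpolating the raw list of records into the prompt also includes MongoDB's `_id` values and can grow without bound, which matters with a 2048-token sequence limit. One possible way to keep the prompt compact is sketched below. The helper name `format_records_for_prompt`, the cap of 20 records, and the `count` field in the sample data are all assumptions for illustration; the real schema is not shown in this notebook.

```python
import json

def format_records_for_prompt(records, max_records=20):
    """Serialize records as compact JSON lines, dropping MongoDB's `_id` field."""
    lines = []
    for record in records[:max_records]:
        cleaned = {k: v for k, v in record.items() if k != "_id"}
        # default=str covers values json cannot encode natively (e.g. datetime)
        lines.append(json.dumps(cleaned, default=str, sort_keys=True))
    omitted = len(records) - len(lines)
    if omitted > 0:
        lines.append(f"... {omitted} more record(s) omitted")
    return "\n".join(lines)

# Hypothetical records; the `count` field name is an assumption
sample = [
    {"_id": "abc", "time": "2024-09-15T19:24:17", "count": 12},
    {"_id": "def", "time": "2024-09-15T19:25:17", "count": 14},
]
print(format_records_for_prompt(sample, max_records=1))
# {"count": 12, "time": "2024-09-15T19:24:17"}
# ... 1 more record(s) omitted
```

The returned string could replace `{records}` in the prompt, so the model sees only the fields relevant to the question.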

### sample inputs

In [None]:
# Example Usage
date = datetime(2024, 9, 15, 19, 24, 17)  # Replace with the desired date and time
result = query_occupancy_with_instruction_model({"time": date})  # Exact match on this timestamp

print(result)