## Using the SQuAD dataset with a locally hosted LLM
The model used initially is Mistral Instruct with 7 billion parameters, compressed with 4-bit quantization (Q4_K_S).
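To give a feel for why quantization matters when running locally, the following back-of-the-envelope sketch estimates weight storage. It assumes exactly 7 billion parameters and exactly 4 bits per weight, and it ignores quantization scales, metadata, and runtime buffers, so real model files are somewhat larger. The helper name `weight_size_gb` is illustrative only.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 7e9  # "7 billion parameters", taken at face value (assumption)

fp16_gb = weight_size_gb(N_PARAMS, 16)  # 16-bit weights as a baseline
q4_gb = weight_size_gb(N_PARAMS, 4)     # idealized 4-bit weights
print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {q4_gb:.1f} GB")
```

Under these assumptions the 4-bit weights take roughly a quarter of the 16-bit footprint.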

In [4]:
import openai
import pandas as pd
from datasets import load_dataset
import os

In [9]:
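# Point the OpenAI client (pre-1.0 `openai` package API) at the local OpenAI-compatible server
openai.api_base = "http://localhost:1234/v1"
openai.api_key = ""  # left empty here; the local server is used without a real key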

In [15]:
# Test the local model with a simple chat request
test = openai.ChatCompletion.create(model="local-model", messages=[
    {"role": "system", "content": "You are Jay"},
    {"role": "user", "content": "Introduce yourself."}
  ])

In [19]:
print(test.choices[0].message)

{
  "role": "assistant",
  "content": "Hello, I am Jay. Nice to meet you! How can I help you today?"
}


In [20]:
dataset = load_dataset("squad")

In [22]:
dataset.shape

{'train': (87599, 5), 'validation': (10570, 5)}

In [23]:
df = pd.DataFrame(dataset['train']).sample(n=10)

In [24]:
df

index,id,title,context,question,answers
9439,57336480d058e614000b5a00,Saint_Barth%C3%A9lemy,"Saint Barthélemy, a volcanic island fully enci...",What is the capital of St. Barts?,"{'text': ['Gustavia'], 'answer_start': [181]}"
25899,5706ee379e06ca38007e921d,Letter_case,The convention followed by many British publis...,What is an alternative name for sentence-style...,"{'text': ['sentence case'], 'answer_start': [3..."
35414,57112c66a58dae1900cd6cf2,Nintendo_Entertainment_System,A thriving market of unlicensed NES hardware c...,What was the name of the NES clone produced in...,"{'text': ['Dendy'], 'answer_start': [238]}"
57517,5727c5ccff5b5019007d94d7,Gramophone_record,The development of quadraphonic records was an...,What did developments in quadraphonic recordin...,"{'text': ['later surround-sound systems'], 'an..."
70749,572ebe3c03f98919007569de,Muammar_Gaddafi,The Bovington signal course's director reporte...,"When Gaddafi returned to Libya, how did he vie...",{'text': ['while he travelled to England belie...
81716,5730f09fa5e9cc1400cdbb13,Russian_language,Among the first to study Russian dialects was ...,Who made the first dialectal Russian dictionary?,"{'text': ['Vladimir Dal'], 'answer_start': [90]}"
12278,56dfb587231d4119001abca2,Pub,"Historically, pubs have been socially and cult...",What are the windows of 1990s and later pubs o...,"{'text': ['clear glass'], 'answer_start': [350]}"
25372,5706e36d9e06ca38007e91e4,Immunology,Maternal factors also play a role in the body’...,"At 6 to 9 months, an infant's immune system be...","{'text': ['glycoproteins'], 'answer_start': [1..."
25371,5706e36d9e06ca38007e91e3,Immunology,Maternal factors also play a role in the body’...,For how long do these antibodies have an effec...,"{'text': ['up to 18 months'], 'answer_start': ..."
22096,56f903a79e9bad19000a07d3,Near_East,The Ministry of Foreign Affairs of the Republi...,"The Middle East, the Balkans and others are in...","{'text': ['Republic of Turkey'], 'answer_start..."


In [35]:
def getAnswerFromModel(row):
    """Ask the local model for an extractive answer to one SQuAD row."""
    question = row['question']
    context = row['context']

    response = openai.ChatCompletion.create(model="local-model", messages=[
        # Plain (non-f) string: braces are written once, otherwise a literal "{{...}}" is sent to the model
        {"role": "system", "content": "Perform extractive Question Answering from the context. The answer should be in an array containing the answer and the start character index of the answer in format of {'text': [], 'answer_start': []}."},
        {"role": "user", "content": f"Context: {context}\n\nQuestion: {question}\nAnswer:"}
    ])

    # Return the text of the first reply
    return response.choices[0].message.content

In [26]:
# Test on a single row before applying to the whole DataFrame.
# The sample is random, so select the first row by position instead of a hard-coded index label.
testAnswer = getAnswerFromModel(df.iloc[0])

In [29]:
testAnswer

<OpenAIObject at 0x16dd2c650> JSON: {
  "role": "assistant",
  "content": "[{'text': ['Gustavia'], 'answer_start': [15]}]"
}
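The reply above reports `answer_start` as 15, while the dataset lists 181 for the same answer text, which suggests the model's character index should not be trusted. One option is to compute the index from the context instead. The sketch below does this with `str.find`. The helper name `find_answer_start` and the sample context are made up for illustration, and `str.find` returns the first occurrence only, which may differ from the occurrence annotated in SQuAD when the answer appears more than once.

```python
def find_answer_start(context: str, answer: str) -> int:
    """Return the character index of the first occurrence of `answer` in `context`, or -1."""
    return context.find(answer)

# Illustrative context, not the actual SQuAD passage.
context = "The island's capital is Gustavia, which is also its main harbour."
answer = "Gustavia"

start = find_answer_start(context, answer)
# Slicing the context at the computed index recovers the answer text.
print(start, context[start:start + len(answer)])
```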

In [36]:
df["Local Mistral Instruct 7B Q4_K_S"] = df.apply(getAnswerFromModel, axis=1)

In [37]:
df

index,id,title,context,question,answers,Local Mistral Instruct 7B Q4_K_S
9439,57336480d058e614000b5a00,Saint_Barth%C3%A9lemy,"Saint Barthélemy, a volcanic island fully enci...",What is the capital of St. Barts?,"{'text': ['Gustavia'], 'answer_start': [181]}","[{'text': ['Gustavia'], 'answer_start': [14]}]"
25899,5706ee379e06ca38007e921d,Letter_case,The convention followed by many British publis...,What is an alternative name for sentence-style...,"{'text': ['sentence case'], 'answer_start': [3...","[{'text': ['sentence case'], 'answer_start': [..."
35414,57112c66a58dae1900cd6cf2,Nintendo_Entertainment_System,A thriving market of unlicensed NES hardware c...,What was the name of the NES clone produced in...,"{'text': ['Dendy'], 'answer_start': [238]}","[{'text': ['Dendy'], 'answer_start': [16, 18, ..."
57517,5727c5ccff5b5019007d94d7,Gramophone_record,The development of quadraphonic records was an...,What did developments in quadraphonic recordin...,"{'text': ['later surround-sound systems'], 'an...",[{'text': ['They inspired later surround-sound...
70749,572ebe3c03f98919007569de,Muammar_Gaddafi,The Bovington signal course's director reporte...,"When Gaddafi returned to Libya, how did he vie...",{'text': ['while he travelled to England belie...,[{'text': ['He returned home “more confident a...
81716,5730f09fa5e9cc1400cdbb13,Russian_language,Among the first to study Russian dialects was ...,Who made the first dialectal Russian dictionary?,"{'text': ['Vladimir Dal'], 'answer_start': [90]}","[{'text': ['Vladimir Dal'], 'answer_start': [1..."
12278,56dfb587231d4119001abca2,Pub,"Historically, pubs have been socially and cult...",What are the windows of 1990s and later pubs o...,"{'text': ['clear glass'], 'answer_start': [350]}","[{'text': ['clear glass'], 'answer_start': [16..."
25372,5706e36d9e06ca38007e91e4,Immunology,Maternal factors also play a role in the body’...,"At 6 to 9 months, an infant's immune system be...","{'text': ['glycoproteins'], 'answer_start': [1...",[[{'text': ['an infant''s immune system begins...
25371,5706e36d9e06ca38007e91e3,Immunology,Maternal factors also play a role in the body’...,For how long do these antibodies have an effec...,"{'text': ['up to 18 months'], 'answer_start': ...",[{'text': ['These passively-acquired antibodie...
22096,56f903a79e9bad19000a07d3,Near_East,The Ministry of Foreign Affairs of the Republi...,"The Middle East, the Balkans and others are in...","{'text': ['Republic of Turkey'], 'answer_start...",Answer: ['The Ministry of Foreign Affairs of t...
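The replies in the last column are not uniformly formatted: some are wrapped in one list, one in two, and one starts with an `Answer:` prefix. A best-effort parser can pull out the answer text before any comparison. The following is a minimal sketch, not a complete solution; `extract_answer_text` is a hypothetical helper, and replies that are not valid Python literals simply yield `None`.

```python
import ast
import re
from typing import Optional

def extract_answer_text(raw: str) -> Optional[str]:
    """Best-effort extraction of the first answer string from a model reply."""
    # Keep the outermost bracketed or braced span, dropping prefixes such as "Answer:".
    match = re.search(r"[\[{].*[\]}]", raw, flags=re.DOTALL)
    if match is None:
        return None
    try:
        parsed = ast.literal_eval(match.group(0))
    except (ValueError, SyntaxError, TypeError):
        return None
    # Unwrap nested lists such as [{...}] or [[{...}]].
    while isinstance(parsed, list) and parsed:
        parsed = parsed[0]
    if isinstance(parsed, dict):
        texts = parsed.get("text") or []
        # "text" may be a list of strings or a bare string.
        if isinstance(texts, list):
            return texts[0] if texts else None
        return texts if isinstance(texts, str) else None
    return parsed if isinstance(parsed, str) else None

print(extract_answer_text("[{'text': ['Gustavia'], 'answer_start': [14]}]"))
```

Applied to the DataFrame, this could look like `df["Local Mistral Instruct 7B Q4_K_S"].map(extract_answer_text)`.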


In [38]:
df.to_csv("Squad_Local_MistralInstruct7BQ4.csv", index=False)
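The saved CSV holds the dataset answers next to the model replies, so a natural follow-up is to compare the two. The sketch below shows one possible check, an exact-match test after SQuAD-style normalization (lowercasing, then stripping punctuation and the articles "a", "an", "the"). The function names are illustrative, and the inputs are assumed to be plain answer strings that have already been extracted from the raw replies.

```python
import re
import string
from typing import List

def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold_answers: List[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)

print(exact_match("Gustavia.", ["Gustavia"]))
print(exact_match("They inspired later surround-sound systems",
                  ["later surround-sound systems"]))
```

Exact match is strict: a reply that contains the gold answer inside a longer sentence does not count, so a token-overlap score may be a useful complement.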