<a href="https://colab.research.google.com/github/HimanshuGitCode/100Days-OpenSource-Library/blob/main/Giskard_Evaluation_%26_Testing_Framework_for_AI_Systems.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://drive.google.com/uc?export=view&id=1wYSMgJtARFdvTt5g7E20mE4NmwUFUuog" width="200">


### **Giskard: Evaluation & Testing Framework for AI Systems**

Giskard is an open-source Python library for evaluating and testing AI systems.  
It detects performance, bias, and security issues in AI models.  
It integrates with tools such as Hugging Face, TensorFlow, and MLflow.  
It helps ensure the quality, security, and compliance of AI systems.  

### **Setup and Installation**

In [1]:
%pip install "giskard[llm]" langchain langchain-openai langchain-community pypdf faiss-cpu openai tiktoken



### **Setup OpenAI Key**

In [3]:
import os

from google.colab import userdata
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')
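
The cell above only works inside Google Colab, because `google.colab.userdata` does not exist elsewhere. A minimal sketch of a fallback for local runs is shown below; the helper name `resolve_openai_key` and its parameters are hypothetical, introduced only for illustration.

```python
import os
from getpass import getpass


def resolve_openai_key(env=os.environ, ask=getpass):
    """Return the OpenAI key from `env`, prompting only when it is missing.

    `env` and `ask` are injectable so the helper can be exercised without
    touching the real environment or blocking on interactive input.
    """
    key = env.get("OPENAI_API_KEY")
    if not key:
        # Assumed fallback for non-Colab sessions: ask the user once
        # and store the answer so later cells can read it.
        key = ask("OPENAI_API_KEY: ")
        env["OPENAI_API_KEY"] = key
    return key
```

Calling `resolve_openai_key()` outside Colab would then stand in for the `userdata.get(...)` lookup.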

### **Create a model with LangChain**

In [4]:
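# FAISS, PromptTemplate and PyPDFLoader are imported from their current packages;
# the top-level `langchain` import paths for them are deprecated.
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Prepare vector store (FAISS) with IPCC report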
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100, add_start_index=True)
loader = PyPDFLoader("https://www.ipcc.ch/report/ar6/syr/downloads/report/IPCC_AR6_SYR_LongerReport.pdf")
db = FAISS.from_documents(loader.load_and_split(text_splitter), OpenAIEmbeddings())
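
To make `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sketch of overlapping windows. It is not how `RecursiveCharacterTextSplitter` works internally (that class splits on separators such as paragraphs and lines and treats `chunk_size` as an upper bound); the function `sliding_chunks` is a hypothetical stand-in that only illustrates the overlap idea and the start offsets that `add_start_index=True` records in each chunk's metadata.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=100):
    """Split `text` into fixed-size windows that share `chunk_overlap` characters.

    Returns a list of (start_index, chunk) pairs.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append((start, text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks
```

With a 10-character string, `chunk_size=4` and `chunk_overlap=1`, each window starts 3 characters after the previous one, so the last character of one chunk is repeated as the first character of the next.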




### **Define Prompt Template for QA Chain**

In [5]:
PROMPT_TEMPLATE = """You are the Climate Assistant, a helpful AI assistant made by Giskard.
Your task is to answer common questions on climate change.
You will be given a question and relevant excerpts from the IPCC Climate Change Synthesis Report (2023).
Please provide short and clear answers based on the provided context. Be polite and helpful.

Context:
{context}

Question:
{question}

Your answer:
"""
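
The template has two placeholders, `{context}` and `{question}`. The sketch below shows roughly what the chain does with them at query time: retrieved excerpts are joined and substituted into `{context}`, and the user's query goes into `{question}`. The template is abbreviated here, and `render_prompt` and the placeholder excerpts are hypothetical, used only to show the substitution.

```python
# Abbreviated copy of the notebook's template, kept short for illustration.
PROMPT_TEMPLATE = """You are the Climate Assistant.

Context:
{context}

Question:
{question}

Your answer:
"""


def render_prompt(question, excerpts):
    """Fill the template the way a 'stuff' style QA chain would (assumed separator: blank line)."""
    context = "\n\n".join(excerpts)
    return PROMPT_TEMPLATE.format(context=context, question=question)


rendered = render_prompt(
    "Is sea level rise avoidable?",
    ["<excerpt 1 from the report>", "<excerpt 2 from the report>"],
)
print(rendered)
```

Printing the rendered prompt this way can help when checking that both placeholders are spelled exactly as listed in `input_variables`.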

### **Prepare QA chain**

In [6]:
from langchain_openai import ChatOpenAI

# gpt-4o is a chat model, so it is loaded through ChatOpenAI
# rather than the completion-style OpenAI class
llm = ChatOpenAI(model="gpt-4o", temperature=0)
prompt = PromptTemplate(template=PROMPT_TEMPLATE, input_variables=["question", "context"])
climate_qa_chain = RetrievalQA.from_llm(llm=llm, retriever=db.as_retriever(), prompt=prompt)


### **Test QA Chain Functionality**

In [7]:
climate_qa_chain.invoke({"query": "Is sea level rise avoidable? When will it stop?"})

Note: with the completion-style `OpenAI` class, this call fails with `NotFoundError: Error code: 404` ("This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?"), because `gpt-4o` is only served through the chat completions endpoint.

### **Wrap model and dataset with Giskard**

In [8]:
!pip install backports.strenum griffe==0.48.0




In [4]:
# Install necessary packages
# Avoid force reinstall unless absolutely necessary
!pip install backports.strenum
!pip install giskard
!pip install griffe[enum34]

# Restart runtime note
print(
    "NOTE: If you see a warning about previously imported packages, "
    "restart the runtime/kernel and rerun this code."
)

# Import required libraries
import giskard
import pandas as pd
import importlib

# Reload modules if required (after installation in the same runtime)
importlib.reload(giskard)

# `climate_qa_chain` is the RetrievalQA chain built in the earlier cells.
# After a runtime restart, rerun those cells first; otherwise the calls
# below fail with a NameError.

def model_predict(df: pd.DataFrame):
    """Wraps the LLM call in a simple Python function.

    Args:
        df (pd.DataFrame): Input DataFrame containing a column "question".

    Returns:
        list: A list of answer strings, one for each row in the input DataFrame.
    """
    # Ensure the DataFrame contains the expected column
    if "question" not in df.columns:
        raise ValueError("The input DataFrame must contain a 'question' column.")

    # RetrievalQA.invoke returns a dict; the generated answer is under "result"
    return [climate_qa_chain.invoke({"query": question})["result"] for question in df["question"]]

# Create a Giskard model around the prediction function
giskard_model = giskard.Model(
    model=model_predict,
    model_type="text_generation",
    name="Climate Change Question Answering",
    description="This model answers any question about climate change based on IPCC reports",
    feature_names=["question"],
)

# Optional: Test the model (Uncomment if a test DataFrame and climate_qa_chain are available)
# test_df = pd.DataFrame({"question": ["What are the causes of climate change?"]})
# predictions = model_predict(test_df)
# print(predictions)




INFO:giskard.models.automodel:Your 'prediction_function' is successfully wrapped by Giskard's 'PredictionFunctionModel' wrapper class.
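
A text-generation prediction function is expected to return one string per input row, while `RetrievalQA.invoke` returns a dictionary containing the original `query` and the generated `result`. The sketch below illustrates that contract with a stub in place of the real chain; `StubChain` and `answers_from_chain` are hypothetical names, and no API call is made.

```python
class StubChain:
    """Hypothetical stand-in for climate_qa_chain, mimicking the dict shape of RetrievalQA.invoke."""

    def invoke(self, inputs):
        question = inputs["query"]
        return {"query": question, "result": f"stub answer to: {question}"}


def answers_from_chain(chain, questions):
    """Return one plain answer string per question by reading the "result" key."""
    return [chain.invoke({"query": q})["result"] for q in questions]


answers = answers_from_chain(StubChain(), ["What causes warming?", "Will it stop?"])
print(answers)
```

Swapping the stub for the real chain keeps the same shape: a list of strings whose length matches the number of questions.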




### **Test the wrapped model**

In [5]:
examples = [
    "According to the IPCC report, what are key risks in Europe?",
    "Is sea level rise avoidable? When will it stop?",
]
giskard_dataset = giskard.Dataset(pd.DataFrame({"question": examples}), target=None)

print(giskard_model.predict(giskard_dataset).prediction)

INFO:giskard.datasets.base:Your 'pandas.DataFrame' is successfully wrapped by Giskard's 'Dataset' wrapper class.
INFO:giskard.datasets.base:Casting dataframe columns from {'question': 'object'} to {'question': 'object'}


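NameError: name 'climate_qa_chain' is not defined

This error typically means the runtime was restarted after the install cell, so `climate_qa_chain` no longer exists in memory. Rerun the cells that build the vector store and the QA chain, then rerun this cell.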

### **Scan your model for vulnerabilities with Giskard**

In [None]:
report = giskard.scan(giskard_model, giskard_dataset, only="hallucination")


In [None]:
display(report)


### **Running the whole scan**

In [None]:
full_report = giskard.scan(giskard_model, giskard_dataset)


### **Save it to a file**

In [None]:

display(full_report)

full_report.to_html("scan_report.html")

### **Generate test suites from the scan**

In [None]:
test_suite = full_report.generate_test_suite(name="Test suite generated by scan")
test_suite.run()