In [1]:
from biascheck.analysis.basecheck import BaseCheck
from langchain.vectorstores import FAISS
from datasets import load_dataset
from langchain.embeddings.huggingface import HuggingFaceEmbeddings

# Load a publicly available dataset from Hugging Face
dataset = load_dataset("imdb", split="train[:20]")  # Load first 20 samples for speed
documents = [{"text": doc["text"]} for doc in dataset]

# Initialize embeddings and FAISS vector database
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_db = FAISS.from_texts([doc["text"] for doc in documents], embeddings)

# Initialize BaseCheck
checker = BaseCheck(
    data=vector_db,
    terms=["discrimination", "bias", "stereotypes"],
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    use_contextual_analysis=True,
)

# Perform analysis
results_df = checker.analyze(top_k=10)

# Generate and print report
report = checker.generate_report(results_df)
print(report)


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


Bias Analysis Report:

Text: "I Am Curious: Yellow" is a risible and pretentious steaming pile. It doesn't matter what one's political views are because this film can hardly be taken seriously on any level. As for the claim that frontal male nudity is an automatic NC-17, that isn't true. I've seen R-rated films with male nudity. Granted, they only offer some fleeting views, but where are the R-rated films with gaping vulvas and flapping labia? Nowhere, because they don't exist. The same goes for those crappy cable shows: schlongs swing
Similarity: 0.14
Sentiment: LABEL_0 (Score: 0.89)
Contextual Analysis Scores:
  This sentence promotes discrimination.: 0.14
  This sentence is fair and unbiased.: 0.02
  This sentence is offensive.: 0.26
Final Classification: This sentence is offensive.

Text: Who are these "They"- the actors? the filmmakers? Certainly couldn't be the audience- this is among the most air-puffed productions in existence. It's the kind of movie that looks like it was a lo
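
The report does not state how the final classification is derived from the contextual analysis scores. The first entry above is consistent with simply taking the highest-scoring hypothesis, so the sketch below assumes that rule. `pick_label` and its `min_score` cutoff are hypothetical helpers for illustration and are not part of the `biascheck` API.

```python
from typing import Dict, Optional


def pick_label(scores: Dict[str, float], min_score: float = 0.0) -> Optional[str]:
    """Return the highest-scoring hypothesis, or None if it is below min_score.

    Assumption: the report's "Final Classification" is an argmax over the
    contextual analysis scores. This is inferred from the printed output,
    not taken from the biascheck source.
    """
    label, best = max(scores.items(), key=lambda item: item[1])
    return label if best >= min_score else None


# Scores copied from the first entry of the report above
scores = {
    "This sentence promotes discrimination.": 0.14,
    "This sentence is fair and unbiased.": 0.02,
    "This sentence is offensive.": 0.26,
}

print(pick_label(scores))                 # This sentence is offensive.
print(pick_label(scores, min_score=0.5))  # None
```

Under this assumption a label is always returned, even when every score is low, so a cutoff such as `min_score` may be worth applying before treating a classification as meaningful.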

In [None]:
from datasets import load_dataset
from biascheck.analysis.basecheck import BaseCheck

# Load the "ag_news" dataset from Hugging Face
dataset = load_dataset("ag_news", split="train[:100]")

# Prepare a graph-like structure. This is only an example of how to use the classes.
# We create a list of dictionaries simulating nodes with "category" and "text" fields.
graph_data = []
for idx, record in enumerate(dataset):
    graph_data.append({
        "id": idx,
        "category": record["label"],
        "text": record["text"]
    })

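# Simulate a graph database whose run() method answers a single Cypher-style query
class SimulatedGraph:
    def run(self, query):
        # Return every node for the match-all query; any other query yields no rows
        if query == "MATCH (n) RETURN n":
            return [{"n": node} for node in graph_data]
        return []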

# Initialize the simulated graph database
graph_db = SimulatedGraph()

# Initialize BaseCheck for the graph database
checker = BaseCheck(
    data=graph_db,
    terms=["discrimination", "bias", "stereotypes"],
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    use_contextual_analysis=True,
    verbose=True
)

# Perform analysis
results_df = checker.analyze()

# Generate and print report
report = checker.generate_report(results_df)
print(report)

# Save results to CSV for further inspection
results_df.to_csv("graph_database_bias_analysis.csv", index=False)


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


Bias Analysis Report:

Text: Wall St. Bears Claw Back Into the Black (Reuters) Reuters - Short-sellers, Wall Street's dwindling\band of ultra-cynics, are seeing green again.
Similarity: 0.09
Sentiment: LABEL_1 (Score: 0.80)
Contextual Analysis Scores:
  This sentence promotes discrimination.: 0.01
  This sentence is fair and unbiased.: 0.01
  This sentence is offensive.: 0.01
Final Classification: This sentence is offensive.

Text: Carlyle Looks Toward Commercial Aerospace (Reuters) Reuters - Private investment firm Carlyle Group,\which has a reputation for making well-timed and occasionally\controversial plays in the defense industry, has quietly placed\its bets on another part of the market.
Similarity: 0.08
Sentiment: LABEL_1 (Score: 0.58)
Contextual Analysis Scores:
  This sentence promotes discrimination.: 0.01
  This sentence is fair and unbiased.: 0.00
  This sentence is offensive.: 0.01
Final Classification: This sentence is offensive.

Text: Oil and Economy Cloud Stocks' Outlo