# S2QA with Ollama - Llama 2


## Setup

First, follow the [readme](https://github.com/jmorganca/ollama#building) to set up and run a local Ollama instance.

This demo uses the following model:

```bash
./ollama run llama2:13b-chat
```

When the Ollama app is running on your local machine:

- All of your local models are automatically served on `localhost:11434`.
- Select your model when setting `llm = Ollama(..., model="<model family>:<version>")`.
- If you set `llm = Ollama(..., model="<model family>")` without a version, it will look for `latest`.
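The naming rule in the list above can be sketched in a few lines. This is a minimal illustration, not part of LlamaIndex or Ollama: `resolve_model_name` and `list_local_models` are hypothetical helper names, and the second one assumes the Ollama server exposes its REST endpoint `GET /api/tags` on the default port mentioned above.

```python
import json
from urllib.request import urlopen

DEFAULT_TAG = "latest"  # tag used when no version is given


def resolve_model_name(model: str) -> str:
    """Return '<model family>:<version>', falling back to ':latest'."""
    family, sep, version = model.partition(":")
    if not family:
        raise ValueError("model name must start with a model family")
    if sep and version:
        return f"{family}:{version}"
    return f"{family}:{DEFAULT_TAG}"


def list_local_models(base_url: str = "http://localhost:11434") -> list:
    """List the models a running Ollama instance serves (assumes /api/tags)."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        payload = json.load(resp)
    return [entry["name"] for entry in payload.get("models", [])]


print(resolve_model_name("llama2"))           # llama2:latest
print(resolve_model_name("llama2:13b-chat"))  # llama2:13b-chat
```

Calling `list_local_models()` while the Ollama app is running is one way to confirm that the tag you pass to `Ollama(model=...)` is actually available locally.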


```python
from llama_index.llms.ollama import Ollama
from llama_index.query_engine import CitationQueryEngine
from llama_index import (
    VectorStoreIndex,
    ServiceContext,
)
from llama_index.response.notebook_utils import display_response
from llama_hub.semanticscholar.base import SemanticScholarReader
```

```python
# "llama2" without a version resolves to the latest tag;
# pass "llama2:13b-chat" to use the model started in Setup
llm = Ollama(model="llama2")

s2reader = SemanticScholarReader()

# narrow down the search space
query_space = "biases in large language models"

# increase limit to get more documents
documents = s2reader.load_data(query=query_space, limit=10)

# build a vector index over the papers, using the local LLM
service_context = ServiceContext.from_defaults(llm=llm)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# retrieve the 3 most similar chunks and cite them in the answer
query_engine = CitationQueryEngine.from_args(
    index,
    similarity_top_k=3,
    citation_chunk_size=512,
)

# query the citation query engine
query_string = "explain all the biases in large language models in a markdown table"
response = query_engine.query(query_string)

# render the answer followed by its source nodes
display_response(
    response, show_source=True, source_length=100, show_source_metadata=True
)
```

**`Final Response:`** Sure! Here's a markdown table summarizing the biases in large language models discussed in the provided sources:

| Bias | Source | Description |
| --- | --- | --- |
| Data selection bias | Source 1, 2 | Bias caused by the choice of texts that make up a training corpus. |
| Social bias | Source 1, 2 | Bias in the text generated by language models trained on such corpora, ranging from gender to age, from sexual orientation to ethnicity, and from religion to culture. |
| Unintentional consequences of biased model outputs | Source 2 | Arising from the nature of training data, model specifications, algorithmic constraints, product design, and policy decisions. |
| Ethical concerns | Source 2 | Bias can lead to unethical or harmful outcomes, such as perpetuating stereotypes or discrimination. |
| Opportunities to mitigate biases | Source 3 | Identifying qualitative categories of erroneous behavior beyond identifying individual errors, drawing inspiration from human cognitive biases. |

Please note that this table is not an exhaustive list of all possible biases in large language models, but rather a summary of the biases discussed in the provided sources.

---

**`Source Node 1/3`**

**Node ID:** cd65f0c1-5086-4473-ae53-1dc74636759d<br>**Similarity:** 0.8526345658029143<br>**Text:** Source 1:
Biases in Large Language Models: Origins, Inventory, and Discussion In this article, we...<br>**Metadata:** {'title': 'Biases in Large Language Models: Origins, Inventory, and Discussion', 'venue': 'ACM Journal of Data and Information Quality', 'year': 2023, 'paperId': '6d0656d9bb60a2bea50c4b894fbcc5d1e32134e7', 'citationCount': 4, 'openAccessPdf': 'https://dl.acm.org/doi/pdf/10.1145/3597307', 'authors': ['Roberto Navigli', 'Simone Conia', 'Björn Ross'], 'externalIds': {'DBLP': 'journals/jdiq/NavigliCR23', 'DOI': '10.1145/3597307', 'CorpusId': 258688053}}<br>

---

**`Source Node 2/3`**

**Node ID:** 9d3b8013-7109-4653-8cab-afd1fa92fa38<br>**Similarity:** 0.833325174947303<br>**Text:** Source 2:
Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models As the ...<br>**Metadata:** {'title': 'Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models', 'venue': 'arXiv.org', 'year': 2023, 'paperId': '16d83e930a4dab2d49f5d276838ddce79df3f787', 'citationCount': 31, 'openAccessPdf': None, 'authors': ['Emilio Ferrara'], 'externalIds': {'DBLP': 'journals/corr/abs-2304-03738', 'ArXiv': '2304.03738', 'DOI': '10.48550/arXiv.2304.03738', 'CorpusId': 258041203}}<br>

---

**`Source Node 3/3`**

**Node ID:** b9a14e40-14ed-476b-a89c-6b88696c2168<br>**Similarity:** 0.8317752061345524<br>**Text:** Source 3:
Capturing Failures of Large Language Models via Human Cognitive Biases Large language m...<br>**Metadata:** {'title': 'Capturing Failures of Large Language Models via Human Cognitive Biases', 'venue': 'Neural Information Processing Systems', 'year': 2022, 'paperId': '76f023c3a819fc58989a064a1b50825b11fce95d', 'citationCount': 29, 'openAccessPdf': None, 'authors': ['Erik Jones', 'J. Steinhardt'], 'externalIds': {'DBLP': 'journals/corr/abs-2202-12299', 'ArXiv': '2202.12299', 'CorpusId': 247084098}}<br>
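
The metadata attached to each source node is enough to build a readable reference list. The sketch below is an illustration only: `format_citation` is a hypothetical helper, the citation style is an arbitrary choice, and the example dictionary copies the fields shown for Source Node 3 above.

```python
def format_citation(meta: dict) -> str:
    """Build a one-line reference from a source node's metadata dict."""
    authors = ", ".join(meta.get("authors") or ["Unknown author"])
    year = meta.get("year", "n.d.")
    title = meta.get("title", "Untitled")
    venue = meta.get("venue") or "unknown venue"
    citation = f"{authors} ({year}). {title}. {venue}."
    # Not every paper has a DOI, so append it only when present
    doi = (meta.get("externalIds") or {}).get("DOI")
    if doi:
        citation += f" doi:{doi}"
    return citation


# Metadata as displayed for Source Node 3/3
example = {
    "title": "Capturing Failures of Large Language Models via Human Cognitive Biases",
    "venue": "Neural Information Processing Systems",
    "year": 2022,
    "authors": ["Erik Jones", "J. Steinhardt"],
    "externalIds": {"ArXiv": "2202.12299", "CorpusId": 247084098},
}

print(format_citation(example))
```

With a live response, the same helper could be applied to each entry of `response.source_nodes`, passing `source.node.metadata`.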