# CMEMQueryBuilder

The CMEM query builder generates a SPARQL query based on a given ontology and a natural language question.

In [None]:
%pip install cmem-cmempy llama-index python-dotenv

Load the environment from a `.env` file.
To create one, run `cp .env-template .env` and edit `.env` to match your setup.

In [1]:
%load_ext dotenv
%dotenv
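
Before moving on, it can help to confirm that the variables the later cells rely on were actually loaded. The following is a minimal sketch, not part of the library: `missing_env_vars` is a hypothetical helper, and the variable names `OPENAI_API_KEY` and `CMEM_BASE_URI` are assumptions that should be adjusted to whatever your `.env-template` defines.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(
    names: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names that are unset or blank in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


# Assumed variable names; adjust them to match your .env-template.
required = ["OPENAI_API_KEY", "CMEM_BASE_URI"]
missing = missing_env_vars(required)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

An empty result means every listed variable has a non-blank value; it does not verify that the credentials themselves are valid.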

Set up the LLM and register it as the llama-index default.

In [2]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

model = "gpt-4o-mini"
llm = OpenAI(model=model)
Settings.llm = llm

Initialize the query builder with an ontology graph and a context (integration) graph.

In [3]:
from llama_index_cmem.utils.cmem_query_builder import CMEMQueryBuilder

ontology_graph = "http://ld.company.org/prod-vocab/"
context_graph = "http://ld.company.org/prod-inst/"

cmem_query_builder = CMEMQueryBuilder(
    ontology_graph=ontology_graph,
    context_graph=context_graph,
    llm=llm,
)

Now define your natural language question and let the query builder generate a SPARQL query.

In [4]:
question = "List all services. Limit the results to 20 items."

cmem_query = cmem_query_builder.generate_sparql(question=question)

## CMEMQuery

The query builder returns a `CMEMQuery` object, which holds the LLM predictions and the SPARQL extracts. This lets you refine the SPARQL query as often as you need.
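
To illustrate what a SPARQL extract is, the sketch below pulls the first fenced code block tagged `sparql` out of a Markdown-formatted prediction. It is not the library's implementation: `extract_sparql` is a hypothetical helper, and it assumes the model wraps its query in such a fenced block.

```python
import re
from typing import Optional

# Build the Markdown fence marker (three backticks) programmatically.
FENCE = "`" * 3

# Match a fenced block tagged "sparql" and capture its body (non-greedy).
_SPARQL_BLOCK = re.compile(
    re.escape(FENCE) + r"sparql\s*\n(.*?)" + re.escape(FENCE),
    re.DOTALL | re.IGNORECASE,
)


def extract_sparql(prediction: str) -> Optional[str]:
    """Return the first fenced SPARQL query in the text, or None if absent."""
    match = _SPARQL_BLOCK.search(prediction)
    return match.group(1).strip() if match else None


# A shortened, made-up prediction used only for illustration.
prediction = (
    "Here is the query:\n\n"
    + FENCE + "sparql\n"
    + "SELECT ?s WHERE { ?s a ?type } LIMIT 20\n"
    + FENCE + "\n"
)
print(extract_sparql(prediction))
```

Keeping the full prediction next to the extracted query means the explanation is still available when a refinement step needs it.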

In [5]:
print("Question:\n")
print(question)
print("\n")

print("Prediction:\n")
print(cmem_query.get_last_prediction())
print("\n")

print("SPARQL:\n")
print(cmem_query.get_last_sparql())
print("\n")

Question:

List all services. Limit the results to 20 items.


Prediction:

To answer the user question "List all services. Limit the results to 20 items," we can construct a SPARQL query that retrieves instances of the `Service` class from the provided RDF ontology. The query will also limit the results to 20 items.

### SPARQL Query

```sparql
PREFIX pv: <http://ld.company.org/prod-vocab/>

SELECT ?service ?name
WHERE {
    ?service a pv:Service .
    ?service pv:name ?name .
}
LIMIT 20
```

### Explanation of the Query

1. **PREFIX Declaration**: 
   - We declare the prefix `pv` to refer to the vocabulary defined at `http://ld.company.org/prod-vocab/`. This allows us to use shorter terms in our query.

2. **SELECT Clause**: 
   - We specify that we want to retrieve two variables: `?service` (the service instance) and `?name` (the name of the service).

3. **WHERE Clause**: 
   - We define the conditions for our query:
     - `?service a pv:Service`: This line filters the results to 