# Sources
[langchain example]('https://python.langchain.com/docs/modules/data_connection/retrievers/self_query/')

In [13]:
from langchain.chains.query_constructor.base import AttributeInfo, StructuredQueryOutputParser, get_query_constructor_prompt

document_content_description = 'Course information of a learning platform specialized on AI.'

metadata_field_info = [
    AttributeInfo(
        name='language',
        description="The language the course is provided in. One of ['de', 'en']",
        type='string',
    )
]

prompt = get_query_constructor_prompt(
    document_content_description,
    metadata_field_info,
)

prompt.suffix = prompt.suffix.replace('<< Example 3. >>', '<< Now complete the following and stop: >>')

In [11]:
print(prompt.format(query='What courses for AI Ethik in german are available?'))

Your goal is to structure the user's query to match the request schema provided below.

<< Structured Request Schema >>
When responding use a markdown code snippet with a JSON object formatted in the following schema:

```json
{
    "query": string \ text string to compare to document contents
    "filter": string \ logical condition statement for filtering documents
}
```

The query string should contain only text that is expected to match the contents of documents. Any conditions in the filter should not be mentioned in the query as well.

A logical condition statement is composed of one or more comparison and logical operation statements.

A comparison statement takes the form: `comp(attr, val)`:
- `comp` (eq | ne | gt | gte | lt | lte | contain | like | in | nin): comparator
- `attr` (string):  name of attribute to apply the comparison to
- `val` (string): is the comparison value

A logical operation statement takes the form `op(statement1, statement2, ...)`:
- `op` (and | or | not

In [14]:
from langchain.llms.openai import AzureOpenAI
from pathlib import Path
import json

output_parser = StructuredQueryOutputParser.from_components()

with open(Path().resolve().parent/'src'/'keys.json', 'r') as f:
    keys = json.load(f)

llm = AzureOpenAI(azure_endpoint=keys['AZURE_OPENAI_ENDPOINT'], 
                  openai_api_key=keys['AZURE_OPENAI_KEY'], 
                  openai_api_version='2023-05-15', 
                  openai_api_type='azure',
                  deployment_name='gpt-3_5',
                  temperature=0)

query_constructor = prompt | llm | output_parser
query_constructor.invoke(
    {
        'query': 'What courses for AI Ethik are available in german?'
    }
)

OutputParserException: Parsing text
```json
{
    "query": "AI Ethik",
    "filter": "eq(\"language\", \"de\")"
}
```


Data Source:
```json
{
    "content": "Course information of a learning platform specialized on AI.",
    "attributes": {
    "difficulty": {
        "description": "The difficulty of the course. One of ['beginner', 'intermediate', 'advanced']",
        "type": "string"
    },
    "price": {
        "description": "The price of the course in USD",
        "type": "float"
    },
    "duration": {
        "description": "The duration of the course in hours",
        "type": "float"
    }
}
}
```

User Query:
What are the beginner courses that are free and take less than 10 hours?

Structured Request:
```json
{
    "query": "",
    "filter": "and(eq(\"difficulty\", \"beginner\"), eq(\"price\", 0), lt(\"duration\", 10))"
}
```


Data Source:
```json
{
    "content": "Course information of a learning platform specialized on AI.",
    "attributes": {
    "difficulty": {
        "description": "The difficulty of
 raised following error:
Got invalid JSON object. Error: Extra data: line 5 column 1 (char 70)