A Wikidata-view over Large Language Models.
KIF is a framework for integrating heterogeneous knowledge sources, including RDF-based interfaces, relational databases, and CSV files. It leverages Wikidata's data model and vocabulary to expose a unified view of the integrated sources. The result is a virtual knowledge base that behaves like an "extended Wikidata" and can be queried through a lightweight query interface. More details about KIF can be found in this paper.
To access a data source via KIF filters, i.e. KIF's query interface, you must create a Store that, based on user-defined mappings, translates filters into queries over the underlying data source in its native language.
LLM Store is a KIF Store whose underlying data sources are LLMs. Filters issued to LLM Store are transformed into prompts that probe the underlying LLM.
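To illustrate the idea, a filter over a (subject, property) pair might be verbalized into a question prompt roughly as follows. This is a hypothetical sketch, not KIF's actual internals; `filter_to_prompt` and its template table are invented for the example.

```python
# Hypothetical sketch of the filter-to-prompt idea; LLM Store's real
# translation logic is more elaborate and lives inside the Store.
def filter_to_prompt(subject: str, prop: str) -> str:
    # Map a Wikidata-style property label to a question template.
    templates = {
        'shares border with': 'Which entities share a border with {subject}?',
        'official language': 'What is the official language of {subject}?',
    }
    return templates[prop].format(subject=subject)

print(filter_to_prompt('Brazil', 'shares border with'))
# The LLM's answer is then parsed back into Wikidata-shaped statements.
```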
LLM Store is powered by LangChain!
```shell
pip install kif-llm-store
```
- Clone this repository:

```shell
git clone https://github.com/IBM/kif-llm-store
cd kif-llm-store
```

- Create a virtual environment, activate it, and install the requirements:

```shell
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

- Set the environment variables:

```shell
LLM_API_KEY=your_api_key
LLM_API_ENDPOINT=platform_endpoint
```
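Before instantiating the store, it can help to check that these variables are visible to Python. The sketch below uses plain `os.environ`; the `load_credentials` helper is ours, not part of the library.

```python
import os

def load_credentials() -> dict:
    # Collect the variables set in the previous step; failing here gives a
    # clearer error than a failed call to the LLM provider later on.
    creds = {name: os.environ.get(name)
             for name in ('LLM_API_KEY', 'LLM_API_ENDPOINT')}
    missing = [name for name, value in creds.items() if not value]
    if missing:
        raise RuntimeError(f'missing environment variables: {missing}')
    return creds
```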
To instantiate an LLM Store, you must indicate an LLM provider to access models. The provider can be open_ai to access models from OpenAI, ibm to access models from IBM WatsonX, or ollama to access models from Ollama. Depending on the platform selected, you need to provide the credentials to access it.
```python
import os

# Import the KIF namespace
from kif_lib import *

# Import the LLM Store main abstraction
from kif_llm_store import LLM_Store

# Import LLM provider identifiers to select the provider LLM Store will run over
from kif_llm_store.store.llm.constants import LLM_Providers

# Using IBM WatsonX models
kb = Store(LLM_Store.store_name,
    llm_provider=LLM_Providers.IBM,
    model_id='meta-llama/llama-3-70b-instruct',
    api_key=os.environ['LLM_API_KEY'],
    base_url=os.environ['LLM_API_ENDPOINT'],
    model_params={
        'decoding_method': 'greedy',
    },
    project_id=os.environ['WATSONX_PROJECT_ID'],
)
```

```python
# Using OpenAI models
kb = Store(LLM_Store.store_name,
    llm_provider=LLM_Providers.OPEN_AI,
    model_id='gpt-4o',
    api_key=os.environ['LLM_API_KEY'],
    model_params={
        'temperature': 0,
    },
)
```

As KIF LLM Store uses LangChain, you can instantiate LLM Store directly with a LangChain chat model, for instance:
```python
import os

# Import the LangChain OpenAI integration
from langchain_openai import ChatOpenAI

# Instantiate a LangChain model for OpenAI
model = ChatOpenAI(model='gpt-3.5-turbo', api_key=os.environ['LLM_API_KEY'])

# Instantiate an LLM Store passing the model as a parameter
kb = Store(store_name=LLM_Store.store_name, model=model)
```

This approach enables you to run LLM Store with any LangChain integration, not only the models listed in LLM_Providers.
The filter below matches statements whose subject is the Wikidata item for Brazil and whose property is the Wikidata property shares border with. It should retrieve statements linking Brazil to the items that share a border with it:
```python
stmts = kb.filter(subject=wd.Brazil, property=wd.shares_border_with, limit=10)
for stmt in stmts:
    display(stmt)
```

See documentation and examples.
Marcelo Machado, João M. B. Rodrigues, Guilherme Lima, Sandro R. Fiorini, Viviane T. da Silva. "LLM Store: Leveraging Large Language Models as Sources of Wikidata-Structured Knowledge". 2024.
Our LLM Store solution to the ISWC LM-KBC 2024 Challenge can be accessed here.
Released under the Apache-2.0 license.