This repository explores the schema of the BindingDB database and exports an extract of its data. The extract serves as input to a partially LLM-based preparation step that produces input for later-stage models.
See the mysql directory for details.
The Jupyter notebook prepare_for_llm.ipynb is the main program for extracting data from BindingDB. Before running this notebook, prepare the database as described in the README.
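
As a rough illustration of the schema exploration mentioned above, the following sketch lists the tables and columns of a MySQL database through information_schema. It is not taken from the notebook: the function names fetch_schema and summarize_schema are hypothetical, and it assumes a DB-API connection whose driver uses the %s parameter style (as mysql-connector-python and PyMySQL do).

```python
from typing import Any, Dict, Iterable, List, Tuple

# Standard MySQL metadata query: one row per column of every table in a schema.
SCHEMA_QUERY = (
    "SELECT table_name, column_name, data_type "
    "FROM information_schema.columns "
    "WHERE table_schema = %s "
    "ORDER BY table_name, ordinal_position"
)


def summarize_schema(
    rows: Iterable[Tuple[str, str, str]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Group (table, column, type) rows into {table: [(column, type), ...]}."""
    schema: Dict[str, List[Tuple[str, str]]] = {}
    for table, column, data_type in rows:
        schema.setdefault(table, []).append((column, data_type))
    return schema


def fetch_schema(connection: Any, database_name: str) -> Dict[str, List[Tuple[str, str]]]:
    """Run the metadata query on an open DB-API connection (hypothetical helper).

    database_name is whatever name the database was given during preparation;
    it is an assumption here and not defined by this sketch.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(SCHEMA_QUERY, (database_name,))
        return summarize_schema(cursor.fetchall())
    finally:
        cursor.close()
```

With a connection opened by the MySQL driver of your choice, fetch_schema(connection, database_name) returns a dictionary that can be printed or inspected in a notebook cell to get a first overview of the tables before writing extraction queries.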