First, let's import the relevant module and create a `NeuralDB` instance.

In [4]:
from thirdai import neural_db as ndb

db = ndb.NeuralDB(user_id="my_user")

At this point, the db is uninitialized. We can either initialize it from scratch like this:

In [5]:
db.from_scratch()

Or load a checkpoint / base model as follows:

In [None]:
db.from_checkpoint("path/to/checkpoint")

A database is useless if it doesn't contain anything. So let's insert things into it!

The insertion method takes a list of `Document` objects. The `Document` interface is easy to extend. We plan to support some common formats out of the box, but for now we'll show you how to extend the `Document` interface.
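
To give a feel for what such an extension might look like, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: `ToyCSVDocument` and its methods (`size`, `strong_text`, `weak_text`, `reference_text`) are hypothetical names, not thirdai's actual `Document` interface, and the sample rows are made up. The split into strong, weak, and reference columns simply mirrors the parameter names used in this notebook, read as "text that should match strongly", "text that should match weakly", and "text to show in results".

```python
import csv
import io
from typing import Dict, List


class ToyCSVDocument:
    """Hypothetical stand-in for a Document extension (not thirdai's real interface)."""

    def __init__(self, csv_file, strong_columns, weak_columns, reference_columns):
        # Read every CSV row into memory as a dict keyed by column name.
        self.rows: List[Dict[str, str]] = list(csv.DictReader(csv_file))
        self.strong_columns = strong_columns
        self.weak_columns = weak_columns
        self.reference_columns = reference_columns

    def size(self) -> int:
        """Number of searchable elements (one per CSV row)."""
        return len(self.rows)

    def _join(self, element_id: int, columns: List[str]) -> str:
        row = self.rows[element_id]
        return " ".join(row[column] for column in columns)

    def strong_text(self, element_id: int) -> str:
        return self._join(element_id, self.strong_columns)

    def weak_text(self, element_id: int) -> str:
        return self._join(element_id, self.weak_columns)

    def reference_text(self, element_id: int) -> str:
        return self._join(element_id, self.reference_columns)


# Made-up sample data standing in for a real CSV file on disk.
sample_csv = io.StringIO(
    "passage,para\n"
    "termination requires thirty days notice,section 1\n"
    "confidentiality survives for five years,section 2\n"
)

doc = ToyCSVDocument(
    sample_csv,
    strong_columns=["passage"],
    weak_columns=["para"],
    reference_columns=["passage"],
)
print(doc.size(), "|", doc.strong_text(0), "|", doc.weak_text(0))
```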

Phew! *Now* we insert the document, using the `CSVDocument` class imported from `utils`.

In [None]:
from utils import CSVDocument

csv_doc = CSVDocument(
    path="sample_nda.csv",
    strong_columns=["passage"],
    weak_columns=["para"],
    reference_columns=["passage"])

# Just like that!
db.add_documents([csv_doc])

Now let's search.

In [9]:
search_results = db.search(
    query="what is the termination period",
    top_k=2,
    on_error=lambda error_msg: print(f"Error! {error_msg}"))

for result in search_results:
    print(result.text())
    # print(result.source())
    # print(result.metadata())
    # result.show()

12. entire agreement. this agreement constitutes the entire agreement with respect to the subject matter hereof and supersedes all prior agreements and understandings between the parties (whether written or oral) relating to the subject matter and may not be amended or modified except in a writing signed by an authorized representative of both parties. the terms of this agreement relating to the confidentiality and non-use of confidential information shall continue after the termination of this agreement for a period of the longer of (i) five (5) years or (ii) when the confidential information no longer qualifies as a trade secret under applicable law.
13. severability. each party acknowledges that should any provision of this agreement be determined to be void invalid or otherwise unenforceable by any court of competent jurisdiction such determination shall not affect the remaining provisions hereof which shall remain in full force and effect.


In [10]:
search_results = db.search(
    query="parties involved",
    top_k=2,
    on_error=lambda error_msg: print(f"Error! {error_msg}"))

for result in search_results:
    print(result.text())
    # print(result.source())
    # print(result.metadata())
    # result.show()

3. joint undertaking. each party agrees that it will not at any time disclose give or transmit in any manner or for any purpose the confidential information received from the other party to any person firm or corporation or use such confidential information for its own benefit or the benefit of anyone else or for any purpose other than to engage in discussions regarding a possible business relationship or the current business relationship involving both parties.
in witness whereof this agreement has been duly executed by the parties hereto as of the latest date set forth below: acme inc. starwars inc. by: by: name: bugs bunny name: luke skywalker title: ceo title: ceo date: may 5 2023 date: may 7 2023


Oops! It looks like when we search for "parties involved", the passage we want comes back in second position rather than first.
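
If you would rather check the position in code than by eye, a small helper along these lines could do it. `rank_of_phrase` is a hypothetical helper, not part of NeuralDB; it works on plain strings, so in the notebook you would first collect them with `[result.text() for result in search_results]`. The two strings below are shortened stand-ins for the passages printed above.

```python
from typing import List, Optional


def rank_of_phrase(texts: List[str], phrase: str) -> Optional[int]:
    """Return the 1-based rank of the first text containing `phrase`, or None."""
    needle = phrase.lower()
    for rank, text in enumerate(texts, start=1):
        if needle in text.lower():
            return rank
    return None


# Shortened stand-ins for the two passages returned for "parties involved".
texts = [
    "3. joint undertaking. each party agrees that it will not at any time disclose",
    "in witness whereof this agreement has been duly executed by the parties hereto",
]

print(rank_of_phrase(texts, "in witness whereof"))  # prints 2
```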

No worries, let's go over some of NeuralDB's advanced features. The first one is text-to-text association. This allows you to teach the model that two keywords, phrases, or concepts are related.

Based on the above example, let's teach the model that "parties involved" and the phrase "made by and between" refer to the same thing.

In [15]:
db.associate(source="parties involved", target="made by and between")

In [16]:
search_results = db.search(
    query="parties involved",
    top_k=2,
)

for result in search_results:
    print(result.text())
    # print(result.source())
    # print(result.metadata())
    # result.show()

confidentiality agreement this confidentiality agreement (the “agreement”) is made by and between acme. dba tothemoon inc. with offices at 2025 guadalupe st. suite 260 austin tx 78705 and starwars dba tothemars with offices at the forest moon of endor and entered as of may 3 2023 (“effective date”).
3. joint undertaking. each party agrees that it will not at any time disclose give or transmit in any manner or for any purpose the confidential information received from the other party to any person firm or corporation or use such confidential information for its own benefit or the benefit of anyone else or for any purpose other than to engage in discussions regarding a possible business relationship or the current business relationship involving both parties.
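
For intuition only, the effect of an association can be imitated with a toy word-overlap search that expands a query with its associated phrase. This is a deliberately simplified stand-in and not how NeuralDB implements `associate`; the class `ToyAssociativeSearch`, its scoring rule, and the two short passages are all assumptions made for illustration.

```python
from typing import Dict, List, Set


class ToyAssociativeSearch:
    """Toy word-overlap search with query expansion (not NeuralDB's method)."""

    def __init__(self, passages: List[str]):
        self.passages = passages
        self.associations: Dict[str, List[str]] = {}

    def associate(self, source: str, target: str) -> None:
        # Remember that queries equal to `source` should also match `target`.
        self.associations.setdefault(source.lower(), []).append(target.lower())

    @staticmethod
    def _tokens(text: str) -> Set[str]:
        return set(text.lower().split())

    def search(self, query: str, top_k: int) -> List[str]:
        # Expand the query with every phrase associated with it.
        expanded = " ".join([query.lower()] + self.associations.get(query.lower(), []))
        query_tokens = self._tokens(expanded)
        # Rank passages by how many distinct words they share with the expanded query.
        ranked = sorted(
            self.passages,
            key=lambda passage: len(query_tokens & self._tokens(passage)),
            reverse=True,
        )
        return ranked[:top_k]


undertaking = "each party agrees not to disclose confidential information involving both parties"
preamble = "this agreement is made by and between acme and starwars"

toy_db = ToyAssociativeSearch([undertaking, preamble])
before = toy_db.search("parties involved", top_k=2)

toy_db.associate(source="parties involved", target="made by and between")
after = toy_db.search("parties involved", top_k=2)

print(before[0])
print(after[0])
```

Before the association, only the first passage shares a word with the query; afterwards, the expanded query overlaps with the preamble on four words, so the preamble moves to the top.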


As usual, saving is a one-liner.

In [None]:
# save your db
db.save("path/to/checkpoint/")

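# Loading works just as shown above, with an optional progress handler.
# Assumption: the handler receives a fraction between 0 and 1, so it is formatted as a percentage.
db.from_checkpoint(
    "path/to/checkpoint",
    on_progress=lambda fraction: print(f"{fraction:.0%} done with loading."))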