First, let's import the relevant module and create a NeuralDB instance.

In [None]:
from thirdai import neural_db as ndb

db = ndb.NeuralDB(user_id="my_user")

At this point, the db is uninitialized. We can either initialize it from scratch like this:

In [None]:
db.from_scratch()

Or load a checkpoint / base model as follows:

In [None]:
db.from_checkpoint("path/to/checkpoint")

Or, if you really need to, initialize with a UDT model:

In [None]:
from thirdai import bolt

db.from_udt(
    udt=bolt.UniversalDeepTransformer.load("path/to/udt.bolt"),
    id_col="DOC_ID", id_delimiter=":", query_col="QUERY", 
    input_dim=50_000, hidden_dim=2048, extreme_output_dim=50_000)

A database is useless if it doesn't contain anything. So let's insert things into it!

In [None]:
import utils

# @Tharun please change the parameters here.
csv_doc = utils.CSV(
    path="test_cs.csv", 
    weak_columns=["weak1", "wEaK2"], 
    strong_columns=["strong1", "stronk2"], 
    reference_columns=["ref1", "REF2"])

# Just like that!
db.insert([csv_doc])

# Note that this method does not throw, since it catches all exceptions.
# To handle errors or track progress, you can pass in handlers like this:
db.insert(
    sources=[csv_doc], 
    on_progress=lambda fraction: print(f"Progress at {fraction * 100}%"),
    on_error=lambda error_msg: print(f"Error! {error_msg}"),
    on_irrecoverable_error=lambda error_msg: print(f"Really bad error! {error_msg}"),
    on_success=lambda: print("SUCCESS!!!"))

# @Tharun I thought about changing this to be db.insert_into_db but I don't think
# a real database would have a method called "insert_into_db" since people know
# it's a database. I strongly suggest that we keep it as is, or maybe rename it 
# to "insert", but I say let's avoid "db" in the method name. Plus, people
# are going to call the neural db object with the variable "db" since that's
# the example we give in this demo, so I think it will be clear enough that
# it's a DB!
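
Because `insert` reports problems through callbacks rather than raising, it can be convenient to bundle the handlers in one object and inspect it afterwards. The sketch below is an illustration under assumptions: `InsertReport` and `fake_insert` are hypothetical names, and `fake_insert` only imitates the catch-all behavior described above so that the snippet runs on its own. The handler signatures (a fraction for `on_progress`, a message for the two error handlers, no arguments for `on_success`) are taken from the lambdas in the cell above.

```python
class InsertReport:
    """Collects everything the insert callbacks report."""

    def __init__(self):
        self.progress = []      # fractions passed to on_progress
        self.errors = []        # messages passed to on_error
        self.fatal_errors = []  # messages passed to on_irrecoverable_error
        self.succeeded = False

    def _mark_success(self):
        self.succeeded = True

    def handlers(self):
        """Return the callbacks as keyword arguments for insert()."""
        return {
            "on_progress": self.progress.append,
            "on_error": self.errors.append,
            "on_irrecoverable_error": self.fatal_errors.append,
            "on_success": self._mark_success,
        }


def fake_insert(sources, on_progress, on_error, on_irrecoverable_error, on_success):
    """Hypothetical stand-in for db.insert: reports via callbacks, never raises."""
    try:
        for index, source in enumerate(sources, start=1):
            if source is None:
                on_error(f"source {index} is missing; skipped")
            on_progress(index / len(sources))
        on_success()
    except Exception as exc:  # mirror the catch-all behavior noted above
        on_irrecoverable_error(str(exc))


report = InsertReport()
fake_insert(["a.csv", None, "b.csv"], **report.handlers())
print(report.succeeded, report.errors, report.progress[-1])
```

If the real handlers behave as the lambdas above suggest, the same object could be passed along as `db.insert(sources=[csv_doc], **report.handlers())`.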

Wait, we added the same document multiple times... is that going to be an issue?

Nope! Let's double-check by listing all of our sources.

In [None]:
db.sources() # @Tharun on second thought we should probably call this documents() or list_documents() to be consistent.
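
How NeuralDB recognizes a repeated document is not described in this demo. One common approach, sketched below purely as an illustration, is to key each source by a hash of its content so that inserting the same content a second time is a no-op. `SourceRegistry` is a hypothetical class and is not part of thirdai.

```python
import hashlib


class SourceRegistry:
    """Toy registry that ignores sources whose content was already inserted."""

    def __init__(self):
        self._sources = {}  # content hash -> source name

    def insert(self, name, content):
        """Insert `content` (bytes) under `name`; return True if it was new."""
        key = hashlib.sha256(content).hexdigest()
        if key in self._sources:
            return False
        self._sources[key] = name
        return True

    def sources(self):
        """List the names of all distinct sources."""
        return list(self._sources.values())


registry = SourceRegistry()
registry.insert("test_cs.csv", b"strong1,weak1\nfoo,bar\n")
registry.insert("test_cs.csv", b"strong1,weak1\nfoo,bar\n")  # duplicate, ignored
print(registry.sources())
```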

Now let's search.

In [None]:
search_results = db.search(
    query="STRONK", # @Tharun feel free to change the query.
    top_k=2,
    on_error=lambda error_msg: print(f"Error! {error_msg}"))

for result in search_results:
    print(result.text())
    print(result.context(radius=3))
    print(result.source())
    print(result.metadata())
    result.show()

Now that we've covered the basics, let's go over some of NeuralDB's advanced features. The first is text-to-result association. Suppose that, of the top k results above, you actually like the 2nd result best. You can tell the model exactly that by running the following snippet:

In [None]:
db.text_to_result(text="STRONK", result_id=search_results[1].id())
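
Since the search above asks for only `top_k=2` results, indexing past the end of the result list raises an `IndexError`. A small guard avoids this. The sketch below uses a hypothetical helper, `pick_result`, and plain strings in place of real search results so that it runs on its own.

```python
def pick_result(results, rank):
    """Return the result at 1-based `rank`, or None if it is out of range."""
    if 1 <= rank <= len(results):
        return results[rank - 1]
    return None


# Plain strings stand in for search results here.
results = ["first hit", "second hit"]
print(pick_result(results, 2))
print(pick_result(results, 3))
```

With real results, one would check that `pick_result(search_results, 2)` is not `None` before passing its `id()` to `db.text_to_result`.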

Even more interesting is text-to-text association, which lets you teach the model that two keywords, phrases, or concepts are related.

In [None]:
db.associate(source="stronk", target="very stronk")

As usual, saving is a one-liner.

In [None]:
db.save("path/to/checkpoint/")
# You can also track the progress
db.save("path/to/checkpoint/", on_progress=lambda fraction: print(f"{fraction * 100}% done with saving."))

# Loading is just like we showed above, with an optional progress handler
db.from_checkpoint("path/to/checkpoint", on_progress=lambda fraction: print(f"{fraction * 100}% done with loading."))
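
The save and load handlers differ only in their message, so they can be generated by one small factory. This sketch assumes the handler receives a value between 0 and 1, as the parameter name `fraction` and the `insert` example suggest. `format_progress` and `progress_printer` are hypothetical helpers introduced here.

```python
def format_progress(fraction, action):
    """Format a 0-to-1 fraction as a whole-number percentage message."""
    return f"{fraction:.0%} done with {action}."


def progress_printer(action):
    """Build an on_progress callback that prints formatted progress."""
    return lambda fraction: print(format_progress(fraction, action))


on_save_progress = progress_printer("saving")
on_save_progress(0.5)
```

Under that assumption, the handler could be passed as `db.save("path/to/checkpoint/", on_progress=progress_printer("saving"))`.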

If you decide to start anew, we have just the right method for you.

In [None]:
db.clear_sources()