# Load a knowledge base into an empty DAS

This notebook shows how to start an empty DAS and load a knowledge base into it.

The first cell just imports the relevant class and instantiates a DAS object.

In [None]:
from das.distributed_atom_space import DistributedAtomSpace
import warnings
# avoids an annoying warning message from the Couchbase lib
warnings.filterwarnings('ignore')
das = DistributedAtomSpace()

Point `KNOWLEDGE_BASE` to the file or folder that contains the knowledge base. Archives such as tarballs or zip files are not accepted here, only plain `.metta` or `.scm` files, or folders containing several such files.

In [None]:
KNOWLEDGE_BASE = "/tmp/samples"
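Before loading, it can help to confirm that the path really contains plain `.metta` or `.scm` files. The following is a minimal sketch using only the standard library; `find_knowledge_base_files` is a hypothetical helper, not part of DAS, and it assumes that only the top level of a folder matters (whether DAS itself descends into subfolders is not stated here).

```python
from pathlib import Path

# Extensions the text names as loadable; archives are not accepted.
SUPPORTED_SUFFIXES = {".metta", ".scm"}


def find_knowledge_base_files(path):
    """Return the sorted .metta/.scm files at `path` (a file or a folder).

    Hypothetical helper for illustration only. It looks at the top level
    of a folder and does not descend into subfolders (an assumption).
    """
    root = Path(path)
    if root.is_file():
        candidates = [root]
    elif root.is_dir():
        candidates = [p for p in root.iterdir() if p.is_file()]
    else:
        # Path does not exist: nothing to load.
        candidates = []
    return sorted(p for p in candidates if p.suffix in SUPPORTED_SUFFIXES)
```

Calling `find_knowledge_base_files(KNOWLEDGE_BASE)` and getting an empty list back suggests the path is wrong or contains only unsupported files.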

Choose one of the two load methods according to the knowledge base format. `load_canonical_knowledge_base()` is much faster, but it can be used only for `.metta` files that satisfy some extra assumptions:

- The DBs are empty.
- All MeTTa files have exactly one toplevel expression per line.
- There are no empty lines.
- Every "named" expression (e.g. a node) mentioned in a given
  expression is already mentioned in a typedef (i.e. something
  like '(: "my_node_name" my_type)' previously IN THE SAME FILE).
- Every type mentioned in a typedef is already defined IN THE SAME FILE.
- All expressions are normalized (regarding separators, parentheses, etc.)
  like '(: "my_node_name" my_type)' or
  '(Evaluation "name" (Evaluation "name" (List "name" "name")))'. No tabs,
  no double spaces, no spaces after '(', etc.
- All typedefs appear before any regular expressions.
- Among typedefs, any terminal types (e.g. '(: "my_node_name" my_type)') appear
  after all actual type definitions (e.g. '(: Concept Type)').
- No "(" or ")" in atom names.
- Flat type hierarchy (i.e. all types inherit from Type).
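Some of the layout assumptions above can be checked mechanically before choosing the canonical loader. The sketch below is a hypothetical pre-check, not part of DAS: `find_canonical_violations` covers only the line-level rules (no empty lines, no tabs, no double spaces, no space after `(`, one toplevel expression per line) and does not verify typedef ordering or the type hierarchy. The parenthesis counting relies on the assumption that atom names contain no `(` or `)`.

```python
def is_single_toplevel_expression(line):
    """True if `line` is exactly one balanced, parenthesized expression.

    Relies on the assumption that atom names contain no '(' or ')'.
    """
    if not line.startswith("("):
        return False
    depth = 0
    for index, char in enumerate(line):
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                return False
            # Closing the toplevel expression before the end of the line
            # means something else follows it.
            if depth == 0 and index != len(line) - 1:
                return False
    return depth == 0


def find_canonical_violations(lines):
    """Return (line_number, reason) pairs for lines that break the layout rules."""
    violations = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line == "":
            violations.append((number, "empty line"))
            continue
        if "\t" in line:
            violations.append((number, "tab character"))
        if "  " in line:
            violations.append((number, "double space"))
        if "( " in line:
            violations.append((number, "space after '('"))
        if not is_single_toplevel_expression(line):
            violations.append((number, "not a single toplevel expression"))
    return violations
```

Passing an open file object works, since iterating over it yields lines: `with open(path) as f: print(find_canonical_violations(f))`. An empty result does not prove the file is canonical; it only rules out these particular problems.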

Usually, "canonical" files are generated automatically by a conversion tool (e.g. `flybase2metta`).

If the knowledge base consists of `.scm` or regular `.metta` file(s), use `das.load_knowledge_base()` instead.

In [None]:
das.clear_database()
das.load_knowledge_base(KNOWLEDGE_BASE)
#das.load_canonical_knowledge_base(KNOWLEDGE_BASE)

If the knowledge base is large, the load can take a while to finish. You can follow the progress by looking at `/tmp/das.log`. Once it's done, run a count to make sure the load worked.

In [None]:
das.count_atoms()
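For long loads, the log file mentioned above can be inspected from a separate Python session while the load cell is still running. This is a minimal sketch with the standard library only; `tail` is a hypothetical helper, and since the format of the log lines is not described here, it just returns the raw text.

```python
from collections import deque


def tail(path, count=10):
    """Return the last `count` lines of the text file at `path`."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the most recent `count` lines,
        # so the whole file is never held in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

For example, `print("\n".join(tail("/tmp/das.log", 5)))` shows the five most recent log lines.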