# Building a simple database from a small experiment
In this tutorial we'll take a small experiment, consisting of raw localizations, widefield images, and metadata, and build a database from it. The database will live inside an [HDF](https://www.hdfgroup.org/) file, and B-Store will handle the organization of the data inside that file.

In [1]:
# Import the essential bstore libraries
from bstore import database, parsers

# pathlib is part of the Python standard library (3.4 and greater), not part of B-Store
from pathlib import Path

## Before starting: Get the test data
You can get the test data for this tutorial from the B-Store test repository at https://github.com/kmdouglass/bstore_test_files. Clone or download the files and change the path below to point to the folder *test_experiment_2* within this repository.

In [6]:
dataDirectory = Path('../../bstore_test_files/test_experiment_2/') # ../ means go up one directory level
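
Before going further, it can help to confirm that the path really points at the test data. The following is a minimal sketch that uses only `pathlib`; `listDataFiles` is a hypothetical helper written for this tutorial and is not part of B-Store.

```python
from pathlib import Path

def listDataFiles(directory):
    """Return all regular files below directory, sorted by path.

    Hypothetical helper for illustration; not part of B-Store.
    """
    directory = Path(directory)
    if not directory.is_dir():
        raise FileNotFoundError('Not a directory: {}'.format(directory))
    return sorted(p for p in directory.rglob('*') if p.is_file())

dataDirectory = Path('../../bstore_test_files/test_experiment_2/')

if dataDirectory.is_dir():
    # Print each file relative to the experiment folder
    for f in listDataFiles(dataDirectory):
        print(f.relative_to(dataDirectory))
else:
    # Show where Python actually looked, which helps when fixing the path
    print('Directory not found:', dataDirectory.resolve())
```

If the directory is not found, the printed absolute path shows how the relative path was interpreted from the current working directory.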

# Step one: Create a parser to read the datasets
In this step, we'll create a parser that can read the files stored inside the test data directory. The default parser that comes with B-Store is called `MMParser`, which is short for Micro-Manager parser. We use it to read datasets that were generated by Micro-Manager and our own localization software.

In a later tutorial, we'll show you how to write a simple parser to parse your own datasets.

In [2]:
# Create the parser
parser = parsers.MMParser()

And that's it! Of course, this step is easy if a parser already exists for your data.

We're also ignoring some optional arguments inside the `MMParser()` constructor, but we'll get to those in a later tutorial.

# Step two: Create the empty database object
The database object is what Python uses to build a database inside an HDF file. When we create the object, we specify a path to the file where the information will be stored.

Note that no file is created until data is actually put into the database.

In [5]:
# The path is relative to this notebook.
# Alternatively, you could pass a Path object to the HDFDatabase constructor.
dbName = 'myFirstDatabase.h5'
myDB   = database.HDFDatabase(dbName)
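
Because the path is relative and the file is only written once data is inserted, it can be useful to see where the file will end up and whether it already exists. This is a small sketch using only `pathlib`; `describeDbFile` is a hypothetical helper, not a B-Store function.

```python
from pathlib import Path

def describeDbFile(dbName):
    """Return the absolute location of dbName and whether a file exists there.

    Hypothetical helper for illustration; not part of B-Store.
    """
    dbPath = Path(dbName).resolve()
    return dbPath, dbPath.exists()

location, exists = describeDbFile('myFirstDatabase.h5')
print('Database file location:', location)
print('File already exists:', exists)
```

Running this right after creating the database object and again after building the database is one way to observe when the file actually appears on disk.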

# Step three: Build the database
Now comes the fun part. We build the database by using the HDFDatabase's `build()` method. To do this, we need to send a few required arguments to the method. These are:

1. parser - The parser to use when interpreting the data files
2. searchDirectory - The parent directory containing subdirectories with all the experimental data

There are also a few optional arguments whose defaults we will override to match our data files' naming patterns. These optional arguments are:

1. locResultsString - A string at the end of all raw localization file names, including the file type
2. locMetadataString - Same as above, but for metadata associated with the localization files
3. widefieldImageString - A string at the end of the file names of any widefield images in the directory
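
To get a feel for how these strings relate to file names, we can check which files in the search directory end with a given string. This is only a sketch: `findBySuffix` is a hypothetical helper, the suffix values are made-up examples rather than B-Store's defaults, and the plain `endswith` test used here is an assumption that may differ from how B-Store matches files internally.

```python
from pathlib import Path

def findBySuffix(searchDirectory, suffix):
    """Return files below searchDirectory whose names end with suffix.

    Hypothetical helper for illustration; not part of B-Store.
    """
    searchDirectory = Path(searchDirectory)
    if not searchDirectory.is_dir():
        return []
    return sorted(p for p in searchDirectory.rglob('*')
                  if p.is_file() and p.name.endswith(suffix))

# Example suffixes only (assumptions); replace them with the endings
# that your own files actually use.
suffixes = {
    'locResultsString': '_locResults.dat',
    'locMetadataString': '_locMetadata.json',
    'widefieldImageString': '.ome.tif',
}

dataDirectory = Path('../../bstore_test_files/test_experiment_2/')
for argument, suffix in suffixes.items():
    matches = findBySuffix(dataDirectory, suffix)
    print('{} = {!r}: {} matching file(s)'.format(argument, suffix, len(matches)))
```

If a suffix matches no files, that is a hint that the corresponding argument needs to be adjusted before building the database.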

Finally, there is a boolean argument named `dryRun`. If you set this to True, the build method won't actually create the database. It will, however, return a structure that tells you which datasets were successfully parsed and can be inserted into the database. By default, `dryRun` is set to False.

In [9]:
# TODO: Make column names in the test dataset without spaces and use those as the test data
# TODO: Do a dry run build
#myDB.build()