# DocTable Example: Pickle and Text Files
Here I show how to use the `picklefile` and `textfile` column types. DocTable transparently saves and reads column data as separate files, which keeps large values out of the table itself and improves the performance of select queries. It automatically creates a folder in the same directory as your sqlite database, and you save and read file data as if you were working with a regular table entry.
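
To make the idea concrete, here is a minimal standard-library sketch of a file-backed column. It is not DocTable's implementation: the `entries` table, the helpers `insert_pickled` and `select_pickled`, and the random `.pic` filenames are all assumptions made for illustration. The table row stores only a filename, while the payload lives in its own pickle file.

```python
import os
import pickle
import sqlite3
import tempfile
import uuid

def insert_pickled(conn, folder, obj):
    """Write obj to its own pickle file and store only the filename in the table."""
    fname = f"{uuid.uuid4().hex}.pic"  # hypothetical naming scheme
    with open(os.path.join(folder, fname), "wb") as f:
        pickle.dump(obj, f)
    conn.execute("INSERT INTO entries (pic) VALUES (?)", (fname,))
    return fname

def select_pickled(conn, folder):
    """Read each stored filename and load the payload it points to."""
    rows = conn.execute("SELECT idx, pic FROM entries ORDER BY idx").fetchall()
    out = []
    for idx, fname in rows:
        with open(os.path.join(folder, fname), "rb") as f:
            out.append((idx, pickle.load(f)))
    return out

folder = tempfile.mkdtemp()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (idx INTEGER PRIMARY KEY, pic TEXT)")
insert_pickled(conn, folder, [1, 2, 3, 4, 5])
rows = select_pickled(conn, folder)
print(rows)  # [(1, [1, 2, 3, 4, 5])]
```

The caller only ever sees the unpickled list, which is the kind of transparency the column types aim for.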

In [1]:
import os
import sys
sys.path.append('..')
import doctable

In [10]:
folder = './tmp'
if not os.path.exists(folder):
    os.mkdir(folder)

# create column schema: each row corresponds to a pickle
import dataclasses
@dataclasses.dataclass
class FileEntry(doctable.DocTableSchema):
    pic: list = doctable.Col(coltype='picklefile', type_args=dict(fpath=folder))
    idx: int = doctable.IDCol()
    
db = doctable.DocTable(schema=FileEntry, target=':memory:')

In [11]:
a = [1, 2, 3, 4, 5]
db.insert(FileEntry(a))
db.select() # regular select using the picklefile datatype

[FileEntry(pic=[1, 2, 3, 4, 5], idx=1)]

Because DocTable creates a transparent interface to these separate files, we need to open a new database with the raw table schema to see the filenames that DocTable uses to reference stored data files.

For performance reasons, DocTable never deletes stored file data unless you call the `.clean_col_files()` method directly. That method raises an exception if a referenced file is missing and deletes all files that are not referenced in the table.
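
The cleanup behavior described here can be sketched with the standard library alone. This is an illustration under stated assumptions, not DocTable's code: the `entries` table, the `clean_files` helper, and the choice to check for missing files before deleting anything are all hypothetical.

```python
import os
import sqlite3
import tempfile

def clean_files(conn, folder):
    """Delete unreferenced files in folder; raise if a referenced file is missing."""
    referenced = {row[0] for row in conn.execute("SELECT pic FROM entries")}
    on_disk = set(os.listdir(folder))
    missing = referenced - on_disk
    if missing:
        # assumption: fail before deleting anything when the table and folder disagree
        raise FileNotFoundError(f"referenced files missing: {sorted(missing)}")
    orphans = on_disk - referenced
    for fname in orphans:
        os.remove(os.path.join(folder, fname))
    return sorted(orphans)

folder = tempfile.mkdtemp()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (idx INTEGER PRIMARY KEY, pic TEXT)")
for name in ("kept.pic", "orphan.pic"):
    open(os.path.join(folder, name), "wb").close()  # create empty placeholder files
conn.execute("INSERT INTO entries (pic) VALUES (?)", ("kept.pic",))

removed = clean_files(conn, folder)
print(removed)             # ['orphan.pic']
print(os.listdir(folder))  # ['kept.pic']
```

Checking for missing files first means an inconsistent folder is left untouched, which makes the error easier to investigate.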

In [12]:
# delete files not referenced in the db; raise an error if a referenced file is missing from the filesystem
db.clean_col_files('pic')

ValueError: Schema must be provided if using memory database or database file does not exist yet. Need to provide schema when creating a new table.

Now I create another DocTable with a changed `fpath` argument. Because the argument changed, DocTable will raise an exception when selecting or calling `.clean_col_files()`. Be wary of this!

In [7]:
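# now specify a different fpath in the column type arguments.
# note: `fname` must hold the path to the existing sqlite database file; it is not defined in the cells above
db2 = doctable.DocTable(schema=[('idcol', 'id'),('picklefile', 'pic', dict(), dict(fpath='custom_file_loc'))], fname=fname)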
try:
    db2.clean_col_files('pic')
except FileNotFoundError:
    print('threw error because no files were found in the custom_file_loc folder, even though the db has a record.')

threw error because no files were found in the custom_file_loc folder, even though the db has a record.
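
One way to guard against this is to check a candidate folder before pointing a column at it. The sketch below is a hypothetical helper, `missing_in_folder`, and not part of DocTable; it assumes you already have the list of filenames that the table references.

```python
import os
import tempfile

def missing_in_folder(referenced, folder):
    """Return the referenced filenames that do not exist under folder."""
    if not os.path.isdir(folder):
        return sorted(referenced)  # nothing can be found in a folder that does not exist
    present = set(os.listdir(folder))
    return sorted(set(referenced) - present)

original = tempfile.mkdtemp()
custom = os.path.join(original, "custom_file_loc")  # deliberately never created
open(os.path.join(original, "a.pic"), "wb").close()
referenced = ["a.pic"]

ok = missing_in_folder(referenced, original)
bad = missing_in_folder(referenced, custom)
print(ok)   # []
print(bad)  # ['a.pic']
```

An empty result means every referenced file is present, so the new `fpath` is safe to use.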


In [None]:
if not os.path.exists(folder):
    os.mkdir(folder)