# DocTable Schemas
Your database table column names and types come from a schema class defined using the `@doctable.schema` decorator. In addition to providing a schema definition, this class can be used to encapsulate data when inserting or retrieving from the database. 

At its most basic, your schema class operates like a [dataclass](https://realpython.com/python-data-classes/) that uses slots for efficiency and allows for custom methods that will not affect the database schema.
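
To make the dataclass comparison concrete, here is a minimal standard-library sketch of a slotted dataclass with a custom method. It does not use doctable at all; the `Person` class and its `is_adult` method are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    # Declaring slots by hand: instances get no per-object __dict__.
    __slots__ = ("name", "age")
    name: str
    age: int

    # A custom method; it is not a field, so it would add no column.
    def is_adult(self) -> bool:
        return self.age >= 18


person = Person(name="Devin Cornell", age=30)
field_names = [f.name for f in fields(Person)]  # only the annotated members
```

Only the annotated members show up as fields, which is the property a schema class relies on when custom methods are added.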

In [2]:
from datetime import datetime
from pprint import pprint
import pandas as pd

import sys
sys.path.append('..')
import doctable

# Introduction

This is an example of a basic doctable schema. Note the use of the decorator `@doctable.schema`, the inclusion of `__slots__ = []`, and the type hints of the member variables - I will explain each of these later in this document.

This class represents a database schema that includes two columns: `name` (a `str`) and `age` (an `int`).

In [3]:
@doctable.schema
class Record:
    __slots__ = []
    name: str
    age: int

The schema class definition is then provided to the doctable constructor to create the database table. Here we create an in-memory sqlite table and show the schema resulting from our custom class. Note that doctable automatically inferred that `name` should be a `VARCHAR` and `age` should be an `INTEGER` based on the provided type hints.

In [4]:
# the schema that would result from this dataclass:
table = doctable.DocTable(target=':memory:', schema=Record)
table.schema_table()

Unnamed: 0,name,type,nullable,default,autoincrement,primary_key
0,name,VARCHAR,True,,auto,0
1,age,INTEGER,True,,auto,0


We can also use the schema class to insert data into our `DocTable`. We simply create a new `Record` and pass it to the `DocTable.insert()` method. Using `.head()`, we see the contents of the database so far. Note that you may also pass a dictionary to insert data - this is just one way of inserting data.

In [5]:
new_record = Record(name='Devin Cornell', age=30)
print(new_record)
table.insert(new_record)
table.head()

Record(name='Devin Cornell', age=30)


Unnamed: 0,name,age
0,Devin Cornell,30


And perhaps more usefully, we can use it to encapsulate results from `.select()` queries. Note that the returned object is exactly the same as the one we put in. Slot classes are more memory-efficient than dictionaries for storing data, but there is cpu time overhead from inserting that data into the slots.

In [6]:
first_record = table.select_first()
print(first_record)

Record(name='Devin Cornell', age=30)


But, of course, the data can be returned in its raw format by passing the parameter `as_dataclass=False`.

In [7]:
first_record = table.select_first(as_dataclass=False)
print(first_record)

('Devin Cornell', 30)
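
The difference between the raw and wrapped return formats can be sketched with the standard library alone. This is an illustration under assumptions, not doctable's code: it uses `sqlite3` directly, and the `Person` dataclass and `people` table are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name VARCHAR, age INTEGER)")
conn.execute("INSERT INTO people (name, age) VALUES (?, ?)", ("Devin Cornell", 30))

# Raw format: sqlite3 returns each row as a plain tuple.
raw_row = conn.execute("SELECT name, age FROM people").fetchone()

# Wrapped format: unpack the tuple into the dataclass, in column order.
wrapped_row = Person(*raw_row)
```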


# The `doctable.schema` Decorator

The `@doctable.schema` decorator does the work to convert your custom class into a schema class. It transforms your schema class in three ways:

1. **create slots**: First, [slot](https://docs.python.org/3/reference/datamodel.html#slots) variable names will be added to `__slots__` automatically based on the fields in your class definition. This is why, by default, you must add `__slots__ = []` with no variable names; omitting it raises an exception. Alternatively, you may turn slots off by passing `require_slots=False` to the decorator (i.e. `@doctable.schema(require_slots=False)`).

2. **convert to dataclass**: Second, your schema class will be converted to a [dataclass](https://realpython.com/python-data-classes/) that generates `__init__`, `__repr__`, and other boilerplate methods meant for classes that primarily store data. Any keyword arguments passed to the `schema` decorator, with the exception of `require_slots`, will be passed directly to the `@dataclasses.dataclass` decorator so you have control over the dataclass definition.

3. **inherit from `DocTableSchema`**: Lastly, your schema class will inherit from `doctable.DocTableSchema`, which provides additional accessors that are used for storage in a `DocTable` and fine-grained control over retrieved data. More on this later.
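
The three steps above can be approximated with the standard library. The sketch below is an assumption about how such a decorator could be written, not doctable's actual implementation: `schema` and `SchemaBase` here are hypothetical stand-ins defined locally.

```python
import dataclasses


class SchemaBase:
    """Hypothetical stand-in for doctable.DocTableSchema."""
    __slots__ = ()

    def __getitem__(self, column):
        return getattr(self, column)


def schema(cls=None, *, require_slots=True, **dataclass_kwargs):
    """Hypothetical decorator; extra keyword arguments go to dataclass()."""
    def wrap(c):
        if require_slots and "__slots__" not in vars(c):
            raise TypeError('add "__slots__ = []" or pass require_slots=False')
        # Step 2: generate __init__, __repr__, and other boilerplate.
        datacls = dataclasses.dataclass(c, **dataclass_kwargs)
        names = tuple(f.name for f in dataclasses.fields(datacls))
        # Copy the class body, dropping entries that would clash with slots.
        skip = set(names) | {"__dict__", "__weakref__"}
        namespace = {k: v for k, v in vars(datacls).items() if k not in skip}
        if require_slots:
            namespace["__slots__"] = names  # Step 1: one slot per field.
        # Step 3: rebuild the class so it inherits from the base class.
        return type(c.__name__, (SchemaBase,), namespace)
    return wrap if cls is None else wrap(cls)


@schema(order=True)  # order=True is forwarded to dataclasses.dataclass
class Person:
    __slots__ = []
    name: str
    age: int


person = Person("Devin Cornell", 30)
```

The class has to be rebuilt because slots can only be declared when a class is created, which is one plausible reason an empty `__slots__ = []` marker is requested up front.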


Column names and types will be inferred from the type hints in your schema class definition. Because `DocTable` is built on [sqlalchemy core](https://docs.sqlalchemy.org/en/14/core/), all fields will eventually be converted to [`sqlalchemy` column objects](https://docs.sqlalchemy.org/en/13/core/type_basics.html) and added to the DocTable metadata. This table shows the type mappings implemented in doctable:

In [8]:
doctable.python_to_slqlchemy_type

{int: sqlalchemy.sql.sqltypes.Integer,
 float: sqlalchemy.sql.sqltypes.Float,
 str: sqlalchemy.sql.sqltypes.String,
 bool: sqlalchemy.sql.sqltypes.Boolean,
 datetime.datetime: sqlalchemy.sql.sqltypes.DateTime,
 datetime.time: sqlalchemy.sql.sqltypes.Time,
 datetime.date: sqlalchemy.sql.sqltypes.Date,
 doctable.textmodels.parsetreedoc.ParseTreeDoc: doctable.schemas.custom_coltypes.ParseTreeDocFileType}

For example, consider the most basic schema class that can be used to create a doctable. We use static default values and type hints including `str`, `int`, `datetime`, and `Any`, which are converted to `VARCHAR`, `INTEGER`, `DATETIME`, and `BLOB` column types, respectively. `BLOB` is used because the type hint `Any` has no entry in the above table.

In [9]:
from typing import Any
import datetime

@doctable.schema
class Record:
    __slots__ = []
    name: str = None
    age: int = None
    time: datetime.datetime = None
    friends: Any = None

# the schema that would result from this dataclass:
doctable.DocTable(target=':memory:', schema=Record).schema_table()

Unnamed: 0,name,type,nullable,default,autoincrement,primary_key
0,name,VARCHAR,True,,auto,0
1,age,INTEGER,True,,auto,0
2,time,DATETIME,True,,auto,0
3,friends,BLOB,True,,auto,0


You can see that this class operates much like a regular dataclass with slots. Thus, these defaulted parameters are applied in the constructor of the schema class, and _NOT_ as the default value in the database schema.

In [10]:
Record('Devin Cornell', 30)

Record(name='Devin Cornell', age=30, time=None, friends=None)
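
The type-hint inference described in this section can be sketched with a plain dictionary lookup. This is a simplified illustration, not doctable's code: `SQL_TYPE_NAMES` and `infer_columns` are hypothetical names, and SQL type names are used in place of sqlalchemy classes.

```python
import datetime
from typing import Any, get_type_hints

# Python type -> SQL type name (a subset chosen for illustration).
SQL_TYPE_NAMES = {
    int: "INTEGER",
    float: "FLOAT",
    str: "VARCHAR",
    bool: "BOOLEAN",
    datetime.datetime: "DATETIME",
    datetime.time: "TIME",
    datetime.date: "DATE",
}


def infer_columns(cls):
    """Map each annotated member to a SQL type name, falling back to BLOB."""
    hints = get_type_hints(cls)
    return {name: SQL_TYPE_NAMES.get(hint, "BLOB") for name, hint in hints.items()}


class Example:
    name: str
    age: int
    time: datetime.datetime
    friends: Any  # no entry in the mapping, so it falls back to BLOB


columns = infer_columns(Example)
```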

# Use `doctable.Col` For More Control Over Schema Creation

Using `doctable.Col()` as a default value in the schema class definition can give you more control over schema definitions. 

Firstly, this function returns a dataclass [`field`](https://docs.python.org/3/library/dataclasses.html#dataclasses.field) object that can be used to set parameters like `default_factory` or `compare` as used by the dataclass. Pass arguments meant for `field` through the `Col` parameter `field_kwargs=dict(..)`. Other data passed to `Col` will be used to create the `DocTable` schema, which is stored as metadata inside the `field`.

This example shows how `Col` can be used to set some parameters meant for `field`. These will affect your schema class behavior without affecting the produced DocTable schema.

In [11]:
@doctable.schema
class Record:
    __slots__ = []
    name: str = doctable.Col()
    age: int = doctable.Col(field_kwargs=dict(default_factory=list, compare=True))

Record()

Record(age=[])
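
The idea of a dataclass `field` that carries schema information as metadata can be sketched as follows. This is a hedged illustration using only `dataclasses`: the `col` helper and the `column_kwargs` metadata key are hypothetical and are not doctable's actual names.

```python
import dataclasses


def col(field_kwargs=None, **column_kwargs):
    """Hypothetical helper: field options apply to the dataclass,
    everything else is stored as metadata for later schema creation."""
    return dataclasses.field(
        metadata={"column_kwargs": column_kwargs},
        **(field_kwargs or {}),
    )


@dataclasses.dataclass
class Item:
    id: int = col(field_kwargs=dict(default=None), primary_key=True)
    tags: list = col(field_kwargs=dict(default_factory=list, compare=False))


item = Item()  # the dataclass behavior comes from field_kwargs
column_kwargs = {
    f.name: f.metadata["column_kwargs"] for f in dataclasses.fields(Item)
}
```

Keeping the two kinds of arguments apart is what lets the class behavior and the table definition be tuned independently.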

`Col` also allows you to explicitly specify a column type using a string, sqlalchemy type definition, or sqlalchemy instance passed to `column_type`. You can then pass arguments meant for the sqlalchemy type constructor through `type_kwargs`. You may also use `type_kwargs` with the column type inferred from the type hint.

In [12]:
import sqlalchemy

@doctable.schema
class Record:
    __slots__ = []
    
    # providing only the type as first argument
    age: int = doctable.Col(sqlalchemy.BigInteger)

    # these are all equivalent
    name1: str = doctable.Col(type_kwargs=dict(length=100)) # infers type from type hint
    name2: str = doctable.Col(sqlalchemy.String, type_kwargs=dict(length=100)) # accepts provided type sqlalchemy.String, pass parameters through type_kwargs
    name3: str = doctable.Col(sqlalchemy.String(length=100)) # accepts type instance (no need for type_kwargs this way)
    name4: str = doctable.Col('string', type_kwargs=dict(length=100))
    

# the schema that would result from this dataclass:
doctable.DocTable(target=':memory:', schema=Record).schema_table()

Unnamed: 0,name,type,nullable,default,autoincrement,primary_key
0,age,BIGINT,True,,auto,0
1,name1,VARCHAR(100),True,,auto,0
2,name2,VARCHAR(100),True,,auto,0
3,name3,VARCHAR(100),True,,auto,0
4,name4,VARCHAR(100),True,,auto,0


A full list of string -> sqlalchemy type mappings is shown below:

In [13]:
doctable.string_to_sqlalchemy_type

{'biginteger': sqlalchemy.sql.sqltypes.BigInteger,
 'boolean': sqlalchemy.sql.sqltypes.Boolean,
 'date': sqlalchemy.sql.sqltypes.Date,
 'datetime': sqlalchemy.sql.sqltypes.DateTime,
 'enum': sqlalchemy.sql.sqltypes.Enum,
 'float': sqlalchemy.sql.sqltypes.Float,
 'integer': sqlalchemy.sql.sqltypes.Integer,
 'interval': sqlalchemy.sql.sqltypes.Interval,
 'largebinary': sqlalchemy.sql.sqltypes.LargeBinary,
 'numeric': sqlalchemy.sql.sqltypes.Numeric,
 'smallinteger': sqlalchemy.sql.sqltypes.SmallInteger,
 'string': sqlalchemy.sql.sqltypes.String,
 'text': sqlalchemy.sql.sqltypes.Text,
 'time': sqlalchemy.sql.sqltypes.Time,
 'unicode': sqlalchemy.sql.sqltypes.Unicode,
 'unicodetext': sqlalchemy.sql.sqltypes.UnicodeText,
 'json': doctable.schemas.custom_coltypes.JSONType,
 'pickle': doctable.schemas.custom_coltypes.CpickleType,
 'parsetree': doctable.schemas.custom_coltypes.ParseTreeDocFileType,
 'picklefile': doctable.schemas.custom_coltypes.PickleFileType,
 'textfile': doctable.schemas.cu

Finally, `Col` allows you to pass keyword arguments directly to the sqlalchemy `Column` constructor. This includes flags like `primary_key` or `default`, which are both used to construct the database schema but do not affect the python dataclass. Note that I recreated the classic `id` column below.

In [14]:
@doctable.schema
class Record:
    __slots__ = []
    id: int = doctable.Col(primary_key=True, autoincrement=True)
    age: int = doctable.Col(nullable=False)
    name: str = doctable.Col(default='MISSING_NAME')

# the schema that would result from this dataclass:
doctable.DocTable(target=':memory:', schema=Record).schema_table()

Unnamed: 0,name,type,nullable,default,autoincrement,primary_key
0,id,INTEGER,False,,auto,1
1,age,INTEGER,False,,auto,0
2,name,VARCHAR,True,,auto,0


I also included some shortcut `Col` functions like `IDCol`, `AddedCol`, and `UpdatedCol` - see below.

In [15]:
import datetime

@doctable.schema
class Record:
    __slots__ = []
    id: int = doctable.IDCol() # auto-increment primary key
    added: datetime.datetime = doctable.AddedCol() # record when row was added
    updated: datetime.datetime = doctable.UpdatedCol() # record when row was updated

doctable.DocTable(target=':memory:', schema=Record).schema_table()

Unnamed: 0,name,type,nullable,default,autoincrement,primary_key
0,id,INTEGER,False,,auto,1
1,added,DATETIME,True,,auto,0
2,updated,DATETIME,True,,auto,0


In this way, `Col` allows you to give fine-grained control to both the schema class behavior and the sql schema definition.

# Working With Schema Objects

Using `Col` default parameters also has some additional side effects, primarily due to the inherited class `DocTableSchema`. Among other things, the `Col` method defines the default dataclass value to be a `doctable.EmptyValue()` object, which is essentially a placeholder for data that was not inserted into the class upon construction. The `__repr__` defined in `DocTableSchema` dictates that member objects containing this value not appear when printing the class, and furthermore, member variables with the value `EmptyValue()` will not be provided in the database insertion. This means that the database schema is allowed to use its own default value - an effect which is most obviously useful when inserting an object that does not have an `id` or other automatically provided values.

The example below shows the `new_record.id` contains `EmptyValue()` as a default, and that the `id` column is not included in the insert query - only `name`.

In [16]:
@doctable.schema
class Record:
    __slots__ = []
    id: int = doctable.IDCol()
    name: str = doctable.Col()

new_record = Record(name='Devin Cornell')
print(new_record, new_record.id)

table = doctable.DocTable(target=':memory:', schema=Record, verbose=True)
table.insert(new_record)
table.head()

Record(name='Devin Cornell') EmptyValue()
DocTable: INSERT OR FAIL INTO _documents_ (name) VALUES (?)
DocTable: SELECT _documents_.id, _documents_.name 
FROM _documents_
 LIMIT ? OFFSET ?


Unnamed: 0,id,name
0,1,Devin Cornell


Yet when we go to retrieve the inserted data, we can see that the value has been replaced by the defaulted value in the database. This is a useful feature if your pipeline involves the insertion of schema objects directly (as opposed to inserting dictionaries for each row).

In [17]:
table.select_first(verbose=False)

Record(id=1, name='Devin Cornell')
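
The effect of skipping placeholder values on insertion can be reproduced with `sqlite3` and a sentinel object. The sketch below is an assumption about the mechanism, not doctable's implementation: `Empty`, `EMPTY`, `Person`, and `insert` are hypothetical names.

```python
import dataclasses
import sqlite3


class Empty:
    """Hypothetical placeholder, standing in for doctable.EmptyValue."""
    def __repr__(self):
        return "Empty()"


EMPTY = Empty()


@dataclasses.dataclass
class Person:
    id: object = EMPTY
    name: object = EMPTY


def insert(conn, table, obj):
    """Insert only the members that hold real data; return the SQL used."""
    values = {f.name: getattr(obj, f.name) for f in dataclasses.fields(obj)}
    values = {k: v for k, v in values.items() if v is not EMPTY}
    if values:
        columns = ", ".join(values)
        marks = ", ".join("?" for _ in values)
        sql = f"INSERT INTO {table} ({columns}) VALUES ({marks})"
    else:
        sql = f"INSERT INTO {table} DEFAULT VALUES"
    conn.execute(sql, tuple(values.values()))
    return sql


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY AUTOINCREMENT, name VARCHAR)"
)
sql = insert(conn, "people", Person(name="Devin Cornell"))
row = conn.execute("SELECT id, name FROM people").fetchone()
```

Because `id` never reaches the database, the auto-increment rule fills it in.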

The `EmptyValue()` feature is also useful when issuing select queries involving only a subset of columns. Here we run a select query that retrieves only the name data, yet the result is still stored in a `Record` object.

In [18]:
returned_record = table.select_first(['name'], verbose=False)
print(returned_record, returned_record.id)

Record(name='Devin Cornell') EmptyValue()


To avoid working with `EmptyValue()` objects directly, it is recommended that you use the `__getitem__` string subscripting to access column data. When using this subscript, the schema object will raise an exception if the returned value is an `EmptyValue()`.

In [19]:
try:
    returned_record['id']
except KeyError as e:
    print(e)

'The column "id" was not retreived in the select statement.'
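
Subscript access that refuses to return a placeholder can be sketched in a few lines. This is an illustrative assumption rather than doctable's code: `Empty`, `EMPTY`, and `Row` are hypothetical names.

```python
class Empty:
    """Hypothetical placeholder for a column that was never filled in."""


EMPTY = Empty()


class Row:
    __slots__ = ("id", "name")

    def __init__(self, id=EMPTY, name=EMPTY):
        self.id = id
        self.name = name

    def __getitem__(self, column):
        value = getattr(self, column)
        if value is EMPTY:
            # Fail loudly instead of handing a placeholder to the caller.
            raise KeyError(f'The column "{column}" holds no data.')
        return value


row = Row(name="Devin Cornell")
name = row["name"]
try:
    row["id"]
    missing_raised = False
except KeyError:
    missing_raised = True
```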


# Indices and Constraints

Indices and constraints are provided to the `DocTable` constructor or definition, as they are not part of the schema class. Here I create custom schema and table definitions where the table has some defined indices and constraints. `doctable.Index` is really just a direct reference to `sqlalchemy.Index`, and `doctable.Constraint` is a mapping to an sqlalchemy constraint type, with the first argument indicating which one.

In [27]:
@doctable.schema
class Record:
    __slots__ = []
    id: int = doctable.IDCol()
    name: str = doctable.Col()
    age: int = doctable.Col()

class RecordTable(doctable.DocTable):
    _tabname_ = 'records'
    _schema_ = Record

    # table indices
    _indices_ = (
        doctable.Index('name_index', 'name'),
        doctable.Index('name_age_index', 'name', 'age', unique=True),
    )
    
    # table constraints
    _constraints_ = (
        doctable.Constraint('unique', 'name', 'age', name='name_age_constraint'),
        doctable.Constraint('check', 'age > 0', name='check_age'),
    )

table = RecordTable(target=':memory:')


And we can see that the constraints are working when we try to insert a record where age is less than 1.

In [28]:
try:
    table.insert(Record(age=-1))
except sqlalchemy.exc.IntegrityError as e:
    print(e)

(sqlite3.IntegrityError) CHECK constraint failed: check_age
[SQL: INSERT OR FAIL INTO records (age) VALUES (?)]
[parameters: (-1,)]
(Background on this error at: http://sqlalche.me/e/13/gkpj)
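
For comparison, roughly equivalent indices and constraints can be written in plain SQL through `sqlite3`. This sketch assumes a hand-written `CREATE TABLE` statement; it is not the SQL that doctable or sqlalchemy actually emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE records (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name VARCHAR,
        age INTEGER,
        CONSTRAINT name_age_constraint UNIQUE (name, age),
        CONSTRAINT check_age CHECK (age > 0)
    )
    """
)
conn.execute("CREATE INDEX name_index ON records (name)")

insert_sql = "INSERT INTO records (name, age) VALUES (?, ?)"
conn.execute(insert_sql, ("Devin Cornell", 30))

# The first row repeats a (name, age) pair; the second breaks the age check.
rejected = []
for bad_row in [("Devin Cornell", 30), ("Someone Else", -1)]:
    try:
        conn.execute(insert_sql, bad_row)
    except sqlite3.IntegrityError:
        rejected.append(bad_row)

row_count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
```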


This is a full list of the mappings between constraint names and the associated sqlalchemy objects.

In [22]:
doctable.constraint_lookup

{'check': sqlalchemy.sql.schema.CheckConstraint,
 'unique': sqlalchemy.sql.schema.UniqueConstraint,
 'primarykey': sqlalchemy.sql.schema.PrimaryKeyConstraint,
 'foreignkey': sqlalchemy.sql.schema.ForeignKeyConstraint}

In [None]:
@doctable.schema
class Record:
    __slots__ = []
    name: str = doctable.Col()

table = doctable.DocTable(target=':memory:', schema=Record, verbose=True)
table.insert(Record())
table.head()

DocTable: INSERT OR FAIL INTO _documents_ DEFAULT VALUES
DocTable: SELECT _documents_.name 
FROM _documents_
 LIMIT ? OFFSET ?


Unnamed: 0,name
0,


But when we insert a `Record` object with a defined `name` parameter, the SQL query shows that some data is being sent to the database.

In [None]:
table.insert(Record(name='Devin J. Cornell'))
table.head()

DocTable: INSERT OR FAIL INTO _documents_ (name) VALUES (?)
DocTable: SELECT _documents_.name 
FROM _documents_
 LIMIT ? OFFSET ?


Unnamed: 0,name
0,
1,Devin J. Cornell


Why does this happen? The answer is that the base class `doctable.DocTableSchema` does some work to ignore missing data on insertion. The key is that `Col` defaults member variables to a `doctable.EmptyValue()` object, which `DocTable` will ignore when inserting data. In this way, omitting data in the dataclass constructor allows the database to use its own default value.

In [None]:
r = Record()
r.name

EmptyValue()

Omitting data in the `Record` constructor results in the database using its own default. In the case of the `id` column, it creates an auto-incremented value to be used as the primary key. We can observe that the database is populating that field by inserting the empty Record into the database and retrieving it again. Note that its values have been defaulted using the database schema.

In [None]:
@doctable.schema
class Record:
    __slots__ = []
    id: int = doctable.Col(primary_key=True, autoincrement=True)
    name: str = doctable.Col(default='whatever')

table = doctable.DocTable(target=':memory:', schema=Record)

r = Record()
print(r)
table.insert(r)

table.select_first()

Record()


Record(id=1, name='whatever')

## Retrieving data as schema objects

As is implicit in the above example, we can also retrieve data as doctable schema objects.

So we can default the value used in the dataclass constructor, or default the value in the database schema. Because no data was provided to `id` in the `Record` constructor, the data was not sent to the database. For this reason, when the database column is not nullable, has no default, and the database receives no information, an exception will result.

In [None]:
@doctable.schema
class Record:
    __slots__ = []
    friends: str = doctable.Col(default='MISSING_NAME_1', field_kwargs=dict(default='MISSING_NAME_2'))

Record()

Record(friends='MISSING_NAME_2')

One side-effect of using `Col` is that by default, `Col` sets the dataclass' default value to `doctable.EmptyValue()`.

In [None]:
# assumes a Record schema with `id` and `age` columns declared via doctable.Col()
r = Record(id=1)
r.age

EmptyValue()



`doctable.DocTableSchema` takes advantage of this placeholder to indicate when data is missing from an object accessed through string subscripting. The base class defines `__getitem__` so that you can access member variables using string subscripting.

In [None]:
@doctable.schema
class Record:
    __slots__ = []
    age: int = doctable.Col(nullable=False)

# the schema that would result from this dataclass:
doctable.DocTable(target=':memory:', schema=Record).schema_table()

Unnamed: 0,name,type,nullable,default,autoincrement,primary_key
0,age,INTEGER,False,,auto,0


And here I pass the `default` parameter to `sqlalchemy.Column`.

In [None]:
@doctable.schema
class Record:
    __slots__ = []
    name: str = doctable.Col(default='NO_NAME')
        
# insert an empty record so that the column default is applied:
table = doctable.DocTable(target=':memory:', schema=Record)
table.insert(Record())
table.head()

Unnamed: 0,name
0,NO_NAME


And, of course, create important index columns.

In [None]:
@doctable.row
class Record:
    __slots__ = []
    id: int = doctable.Col(primary_key=True, autoincrement=True) # can also use doctable.IDCol() as a shortcut
    name: str = doctable.Col()
    age: int = doctable.Col()

# the schema that would result from this dataclass:
doctable.DocTable(target=':memory:', schema=Record).schema_table()

Record(id='Devin Cornell')

To add functionality beyond dataclasses, doctable provides a `Col` function that can be used as the default value in the class definition. This allows you to access both doctable schema features and pass parameters to the `dataclasses.field` function. Using `Col` also sets the default value to a special class called `doctable.EmptyValue`, which, among other things, will not show up in the class `__repr__`. Notice that I use `Col` to create the `id` column we have come to expect in most sql tables.

There are several other custom column types I included for convenience.

In [None]:
import datetime
@doctable.row
class Record:
    __slots__ = []
    id: int = doctable.IDCol() # auto-increment primary key
    added: datetime.datetime = doctable.AddedCol() # record when row was added
    updated: datetime.datetime = doctable.UpdatedCol() # record when row was updated
    name: str = doctable.Col()
    is_old: bool = doctable.Col()

# the schema that would result from this dataclass:
doctable.DocTable(target=':memory:', schema=Record).schema_table()

Unnamed: 0,name,type,nullable,default,autoincrement,primary_key
0,id,INTEGER,False,,auto,1
1,added,DATETIME,True,,auto,0
2,updated,DATETIME,True,,auto,0
3,name,VARCHAR,True,,auto,0
4,is_old,BOOLEAN,True,,auto,0


In [None]:
from datetime import datetime

@doctable.row
class Record:
    
    # custom doctable column types
    id: int = doctable.IDCol() # auto-increment primary key
    added: datetime = doctable.AddedCol() # record when row was added
    updated: datetime = doctable.UpdatedCol() # record when row was updated
    
    # generic column object.
    # Keyword arguments are passed directly to sqlalchemy Column constructor
    name: str = doctable.Col(nullable=False)
    
    # first argument is default value or factory (automatically determined)
    num_siblings: int = doctable.Col(0)
        
    # this will be stored as a binary type in sql
    friends: list = doctable.Col(list)
    
    # can also use regular scalar default values
    age: int = 6
    is_old: bool = None
        
    # indices and constraints - these are used by DocTableRow objects
    _indices_ = {
        # SQLAlchemy: Index('name_index', 'name')
        'name_index': ('name',),
        
        # SQLAlchemy: Index('name_age_index', 'name', 'age', unique=True)
        'name_age_index': ('name', 'age', {'unique':True}),
    }
    
    # add constraints to table
    _constraints_ = (
        
        #SQLAlchemy:  UniqueConstraint('name', 'age')
        ('unique', 'name', 'age'),
        
        #SQLAlchemy: CheckConstraint('age > 0', name='check_age')
        ('check', 'age > 0', {'name':'check_age'}), 
        
        #('foreignkey', ('a','b'), ('c','d')),
    )
        
        
    # dataclass hook that runs after the generated __init__
    def __post_init__(self):
        self.is_old = self.age > 28
        
    # any custom method the user would like to add
    @property
    def num_friends(self):
        return len(self.friends)
    
    
db = doctable.DocTable(target=':memory:', schema=Record)
db.schema_table()

SlotsRequiredError: Slots must be enabled by including "__slots__ = []".                 Otherwise set doctable.row(require_slots=False).

## List Schemas
And this is another example showing the list schema format.

In [None]:
schema = (
    # standard id column
    #SQLAlchemy: Column('id', Integer, primary_key = True, autoincrement=True), 
    ('integer', 'id', dict(primary_key=True, autoincrement=True)),
    # short form (can't provide any additional args though): ('idcol', 'id')

    # make a category column with two options: "FICTION" and "NONFICTION"
    #SQLAlchemy: Column('category', String, nullable=False)
    ('string', 'category', dict(nullable=False)),

    # make a non-null title column
    #SQLAlchemy: Column('title', String, nullable=False)
    ('string', 'title', dict(nullable=False)),

    # make an abstract where the default is an empty string instead of null
    #SQLAlchemy: Column('abstract', String, default='')
    ('string', 'abstract',dict(default='')),

    # make an age column where age must be greater than zero
    #SQLAlchemy: Column('age', Integer)
    ('integer', 'age'),

    # make a column that keeps track of column updates
    #SQLAlchemy: Column('updated_on', DateTime(), default=datetime.now, onupdate=datetime.now)
    ('datetime', 'updated_on',  dict(default=datetime.now, onupdate=datetime.now)),
    # short form to auto-record update date: ('date_updated', 'updated_on')
    
    #SQLAlchemy: Column('updated_on', DateTime(), default=datetime.now)
    ('datetime', 'updated_on',  dict(default=datetime.now)),
    # short form to auto-record insertion date: ('date_added', 'added_on')

    # make a string column with max of 500 characters
    #SQLAlchemy: Column('text', String(length=500))
    ('string', 'text',dict(),dict(length=500)),

    
    ##### Custom DocTable Column Types #####
    
    # uses json.dump to convert python object to json when storing and
    # json.load to convert json back to python when querying
    ('json','json_data'),
    
    # stores pickled python object directly in table as BLOB
    # TokensType and ParagraphsType are defined in doctable/coltypes.py
    # SQLAlchemy: Column('tokenized', TokensType), Column('sentencized', ParagraphsType)
    ('pickle','tokenized'),
    
    # store pickled data into a separate file, recording only filename directly in table
    # the 'fpath' argument can specify where the files should be placed, but by
    # default they are stored in <dbname>_<tablename>_<columnname>
    #('picklefile', 'pickle_obj', dict(), dict(fpath='folder_for_picklefiles')),
    
    # very similar to above, but use only when storing text data
    #('textfile', 'text_file'), # similar to above
    
    
    ##### Constraints #####
    
    #SQLAlchemy: CheckConstraint('category in ("FICTION","NONFICTION")', name='salary_check')
    ('check_constraint', 'category in ("FICTION","NONFICTION")', dict(name='salary_check')),
    
    #SQLAlchemy: CheckConstraint('age > 0')
    ('check_constraint', 'age > 0'),
    
    # make sure each category/title entry is unique
    #SQLAlchemy:  UniqueConstraint('category', 'title', name='work_key')
    ('unique_constraint', ['category','title'], dict(name='work_key')),
    
    # makes a foreign key from the 'subkey' column of this table to the 'id'
    # column of ANOTHERDOCTABLE, setting the SQL onupdate and ondelete foreign key constraints
    #('foreignkey_constraint', [['subkey'], [ANOTHERDOCTABLE['id']]], {}, dict(onupdate="CASCADE", ondelete="CASCADE")),
    #NOTE: Can't show here because we didn't make ANOTHERDOCTABLE
    
    ##### Indexes ######
    
    # make index table
    # SQLAlchemy: Index('ind0', 'category', 'title', unique=True)
    ('index', 'ind0', ('category','title'),dict(unique=True)),
    
)
md = doctable.DocTable(target=':memory:', schema=schema, verbose=True)
md.schema_table()

Unnamed: 0,name,type,nullable,default,autoincrement,primary_key
0,id,INTEGER,False,,auto,1
1,category,VARCHAR,False,,auto,0
2,title,VARCHAR,False,,auto,0
3,abstract,VARCHAR,True,,auto,0
4,age,INTEGER,True,,auto,0
5,updated_on,DATETIME,True,,auto,0
6,text,VARCHAR(500),True,,auto,0
7,json_data,VARCHAR,True,,auto,0
8,tokenized,BLOB,True,,auto,0
