In [1]:
import random
import pandas as pd
import numpy as np
import sys
sys.path.append('..')
import doctable as dt

# Type Mappings
DocTable2 provides a simplified, object-oriented interface to the [SQLAlchemy core](https://docs.sqlalchemy.org/en/13/core/) package (not the object-relational mapping component) for executing SQL commands. It wraps the relevant SQLAlchemy objects inside a single class, so you can create schemas and perform queries without working directly with the many classes that SQLAlchemy core exposes.

Because DocTable2 wraps SQLAlchemy, it helps to know where the two meet. The first point of contact is the type map used to set up the schema: the DocTable2 constructor accepts column types as strings, and the type map translates each string into a SQLAlchemy type:

In [2]:
dt.DocTable2._type_map

{'biginteger': sqlalchemy.sql.sqltypes.BigInteger,
 'boolean': sqlalchemy.sql.sqltypes.Boolean,
 'date': sqlalchemy.sql.sqltypes.Date,
 'datetime': sqlalchemy.sql.sqltypes.DateTime,
 'enum': sqlalchemy.sql.sqltypes.Enum,
 'float': sqlalchemy.sql.sqltypes.Float,
 'integer': sqlalchemy.sql.sqltypes.Integer,
 'interval': sqlalchemy.sql.sqltypes.Interval,
 'largebinary': sqlalchemy.sql.sqltypes.LargeBinary,
 'numeric': sqlalchemy.sql.sqltypes.Numeric,
 'pickle': sqlalchemy.sql.sqltypes.PickleType,
 'smallinteger': sqlalchemy.sql.sqltypes.SmallInteger,
 'string': sqlalchemy.sql.sqltypes.String,
 'text': sqlalchemy.sql.sqltypes.Text,
 'time': sqlalchemy.sql.sqltypes.Time,
 'unicode': sqlalchemy.sql.sqltypes.Unicode,
 'unicodetext': sqlalchemy.sql.sqltypes.UnicodeText,
 'tokens': doctable.coltypes.TokensType,
 'subdocs': doctable.coltypes.SubdocsType}
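
To make the role of the type map concrete, here is a minimal sketch of how string type names could be translated into SQLAlchemy columns. It uses plain SQLAlchemy rather than DocTable2; `TYPE_MAP` and `build_table` are hypothetical names for illustration, and the schema tuples follow the `(name, type string, optional keyword dict)` form that the DocTable2 constructor accepts. This is not necessarily how DocTable2 does it internally.

```python
import sqlalchemy as sa

# Hypothetical, abbreviated stand-in for the type map shown above.
TYPE_MAP = {
    'integer': sa.Integer,
    'string': sa.String,
    'boolean': sa.Boolean,
}

def build_table(tabname, schema):
    """Translate (name, type string, optional kwargs) tuples into a Table."""
    columns = []
    for name, type_name, *rest in schema:
        kwargs = rest[0] if rest else {}
        # Look up the SQLAlchemy type class by its string key and instantiate it.
        columns.append(sa.Column(name, TYPE_MAP[type_name](), **kwargs))
    return sa.Table(tabname, sa.MetaData(), *columns)

schema = (
    ('id', 'integer', dict(primary_key=True, autoincrement=True)),
    ('name', 'string', dict(nullable=False)),
    ('age', 'integer'),
    ('is_old', 'boolean'),
)
table = build_table('mydocuments', schema)
print([c.name for c in table.columns])
```

The string keys keep schema definitions free of SQLAlchemy imports, at the cost of restricting users to the types listed in the map.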

Schemas often need constraints as well, so I've provided a similar string mapping for each constraint type.

In [3]:
dt.DocTable2._constraint_map

{'unique_constraint': sqlalchemy.sql.schema.UniqueConstraint,
 'check_constraint': sqlalchemy.sql.schema.CheckConstraint,
 'primarykey_constraint': sqlalchemy.sql.schema.PrimaryKeyConstraint,
 'foreignkey_constraint': sqlalchemy.sql.schema.ForeignKeyConstraint,
 'index': sqlalchemy.sql.schema.Index}
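
For reference, the SQLAlchemy classes in this map can be used directly as in the sketch below. It uses plain SQLAlchemy and does not show DocTable2's own schema syntax for constraints; the table, column, and index names are made up for illustration.

```python
import sqlalchemy as sa

table = sa.Table(
    'mydocuments', sa.MetaData(),
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('name', sa.String, nullable=False),
    sa.Column('age', sa.Integer),
    sa.UniqueConstraint('name', 'age'),  # no two rows may share (name, age)
    sa.CheckConstraint('age >= 0'),      # reject negative ages
)

# An Index attaches itself to the table that owns its columns.
name_index = sa.Index('ind_name', table.c.name)

# List the kinds of constraints and the indexes now registered on the table.
kinds = sorted(type(c).__name__ for c in table.constraints)
print(kinds)
print([ix.name for ix in table.indexes])
```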

More details about schema creation can be found in examples/dt2_schema.py

# Init DocTable2 Object
A DocTable2 instance maintains a reference to a database. When constructing one, you will typically specify a database file name, a table name, and a database engine (or [dialect - sqlite, mysql, etc](https://docs.sqlalchemy.org/en/13/dialects/)). The constructor parameters are:

- ```fname```: the database name. The default is ":memory:", a special keyword that creates a database in memory. I use that for most of the examples.
- ```tabname```: the table name. The default is ```_documents_```; unless your application requires multiple tables in the same database, you may not need to specify one.
- ```engine```: the database engine. The default is sqlite, which is usually the easiest to work with.
- ```persistent_conn```: chooses whether your application maintains an open connection to the database (useful if you want to call ```.update()``` in a ```.select()``` loop) or makes a new connection every time you execute a query (useful if multiple threads might access the database at the same time).
- ```new_db```: set this to False if you want to access an existing database without creating one when it does not already exist. This prevents accidentally creating a new, empty database when the one you intended to open can't be found.
- ```verbose```: useful for demonstration or debugging; it requests that SQL commands be printed before they are executed. It can also be overridden on a per-transaction basis.

In [4]:
schema = (
    ('id','integer',dict(primary_key=True, autoincrement=True)),
    ('name','string', dict(nullable=False)),
    ('age','integer'),
    ('is_old', 'boolean'),
)
db = dt.DocTable2(schema, tabname='mydocuments', verbose=False)
print(db)

<DocTable2::mydocuments ct: 0>
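
One property of the ":memory:" default is worth knowing when thinking about connections: in SQLite, every new connection to ":memory:" gets its own private, empty database. The sketch below demonstrates this with the standard-library `sqlite3` module rather than DocTable2, and says nothing about how DocTable2 manages its connections internally.

```python
import sqlite3

def table_names(conn):
    """Return the names of all tables visible through this connection."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [name for (name,) in rows]

# Connection A creates a table in its own in-memory database.
conn_a = sqlite3.connect(':memory:')
conn_a.execute(
    'CREATE TABLE mydocuments (id INTEGER PRIMARY KEY, name TEXT NOT NULL)'
)

# Connection B opens a different, empty in-memory database.
conn_b = sqlite3.connect(':memory:')

names_a = table_names(conn_a)
names_b = table_names(conn_b)
print(names_a)  # ['mydocuments']
print(names_b)  # []

conn_a.close()
conn_b.close()
```

With a file-backed database, both connections would see the same table, which is why the choice between a persistent connection and per-query connections matters more there for concurrency than for data visibility.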


# Notes on DB Interface
DocTable2 lets you access columns by subscripting and relies on SQLAlchemy column objects to do most of the work of constructing queries. Here are a few notes on their use. For more demonstrations, see examples/dt2_select.ipynb

In [5]:
# subscript is used to access underlying sqlalchemy column reference (without querying data)
db['id']

Column('id', Integer(), table=<mydocuments>, primary_key=True, nullable=False)

In [6]:
# conditionals are applied directly to the column objects (as we'll see with "where" clause)
db['id'] < 3

<sqlalchemy.sql.elements.BinaryExpression object at 0x7fa2be99eb70>
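
The `BinaryExpression` repr above is not very informative, so here is a small sketch of what such an expression contains. It builds a comparable table in plain SQLAlchemy (the table definition is an illustrative stand-in, not the DocTable2 object) and converts the expressions to strings to reveal the SQL fragments they represent.

```python
import sqlalchemy as sa

table = sa.Table(
    'mydocuments', sa.MetaData(),
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('age', sa.Integer),
)

# Comparing a column to a value builds an expression; nothing is queried.
expr = table.c.id < 3
print(str(expr))  # SQL fragment with a bound-parameter placeholder

# Expressions can be combined before being handed to a "where" clause.
combined = sa.and_(table.c.id < 3, table.c.age >= 18)
print(str(combined))

# The literal values are kept separately as bound parameters.
print(expr.compile().params)
```

Because the literal values travel as bound parameters rather than being pasted into the SQL string, expressions built this way avoid manual quoting.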

In [7]:
# can also access using .col() method
db.col('id')

Column('id', Integer(), table=<mydocuments>, primary_key=True, nullable=False)

In [8]:
# to access all column objects (only useful for working directly with sql info)
db.columns

<sqlalchemy.sql.base.ImmutableColumnCollection at 0x7fa2be9e7750>

In [9]:
# to access more detailed schema information
db.schemainfo

[{'name': 'id',
  'type': INTEGER(),
  'nullable': False,
  'default': None,
  'autoincrement': 'auto',
  'primary_key': 1},
 {'name': 'name',
  'type': VARCHAR(),
  'nullable': False,
  'default': None,
  'autoincrement': 'auto',
  'primary_key': 0},
 {'name': 'age',
  'type': INTEGER(),
  'nullable': True,
  'default': None,
  'autoincrement': 'auto',
  'primary_key': 0},
 {'name': 'is_old',
  'type': BOOLEAN(),
  'nullable': True,
  'default': None,
  'autoincrement': 'auto',
  'primary_key': 0}]
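
The output above has the same shape as the list of dictionaries returned by SQLAlchemy's inspector. As a sketch, assuming you were working with SQLAlchemy directly, similar information can be obtained as follows; whether `.schemainfo` is implemented this way is an assumption, and the engine and table here are stand-ins for illustration.

```python
import sqlalchemy as sa

# Create an in-memory SQLite database with a comparable table.
engine = sa.create_engine('sqlite:///:memory:')
meta = sa.MetaData()
sa.Table(
    'mydocuments', meta,
    sa.Column('id', sa.Integer, primary_key=True, autoincrement=True),
    sa.Column('name', sa.String, nullable=False),
    sa.Column('age', sa.Integer),
    sa.Column('is_old', sa.Boolean),
)
meta.create_all(engine)

# Ask the database itself (not the Python-side Table) for its column metadata.
column_info = sa.inspect(engine).get_columns('mydocuments')
for info in column_info:
    print(info['name'], info['type'], info['nullable'], info['primary_key'])
```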

In [10]:
# If needed, you can also access the sqlalchemy table object using the .table property.
db.table

Table('mydocuments', MetaData(bind=None), Column('id', Integer(), table=<mydocuments>, primary_key=True, nullable=False), Column('name', String(), table=<mydocuments>, nullable=False), Column('age', Integer(), table=<mydocuments>), Column('is_old', Boolean(), table=<mydocuments>), schema=None)

In [11]:
# the count method is also an easy way to count rows in the database
db.count()

0

In [12]:
# printing the object shows the table name and total row count
print(db)

<DocTable2::mydocuments ct: 0>
