# How to use SqlEngine to write data to and read data from a database

This notebook shows how to use the entitymatching API to write data to and read data from a local database. The API can create a database in a local folder, create tables, and execute the basic SQL operations (insert, delete, update and select).

In [None]:
# Add the location of the API, relative to this notebook, to the module search path
import sys
sys.path.append('../../../')

In [None]:
# Import the module for accessing a database
from entitymatching.dbmanager import SqlEngine as DbEngine

## 1. Connect to a database

The connection string below creates a local database based on SQLite, which must be installed on the user's computer. See https://www.tutorialspoint.com/sqlite/sqlite_installation.htm for more information on how to install and use SQLite.

In [None]:
# Location of the database to be created, relative to this Jupyter notebook
# The database will be created in the /data/database folder, under the project's main folder (EntityMatching)
path_db = '../../../data/database/'

In [None]:
# Connection string used for SQLite. Other databases might require different information.
# In this example the connection string is a combination of [sqlite prefix] + [database path] + [database name]
str_connection = 'sqlite:///' + path_db + 'entitymatching.db'
str_connection
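As a minimal sketch of the same composition, the parts of the connection string can be joined with `pathlib`, so the result does not depend on whether the folder path ends with a slash. `build_sqlite_url` is a hypothetical helper written for this illustration and is not part of the entitymatching API.

```python
from pathlib import PurePosixPath


def build_sqlite_url(folder, db_name):
    """Hypothetical helper: compose a relative SQLite URL from a folder and a file name."""
    # PurePosixPath joins the parts with exactly one '/' between them,
    # whether or not `folder` ends with a slash.
    db_path = PurePosixPath(folder) / db_name
    return 'sqlite:///' + str(db_path)


url_with_slash = build_sqlite_url('../../../data/database/', 'entitymatching.db')
url_without_slash = build_sqlite_url('../../../data/database', 'entitymatching.db')
print(url_with_slash)
```

Both calls produce the same string as `str_connection` above. SQLite creates the database file on first use but does not create missing parent folders, so the target folder has to exist beforehand.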

In [None]:
# The database engine object is created by passing the connection string
sqlengine_obj = DbEngine.SqlEngine(str_connection)

In [None]:
# The connect() method of SqlEngine establishes a connection with the database if it exists,
# or creates a new database otherwise. The show_echo parameter is False by default and controls whether the SQL statements
# are echoed (printed) to the default output channel. Here show_echo=True is set so the SQL statements are visible.
sqlengine_obj.connect(show_echo=True)

In [None]:
# Check if the connection was established
sqlengine_obj.is_connected()

## 2. Create a table

In [None]:
# Prepare to create a table
table1_name = 'table1'
columns_names = ['isin', 'lei', 'company_name']

In [None]:
# The create_table() method requires a name for the table and a list with all column names, and returns a table object.
# The parameter add_idx creates an autoincrement column that will be the table's primary key.
table1_obj = sqlengine_obj.create_table(table1_name, columns_names, add_idx=True)

In [None]:
# Check if the table was created successfully
is_table_created = sqlengine_obj.table_exists(table1_name)
is_table_created

In [None]:
# Check the type of the table object
type(table1_obj)

## 3. Insert data in a table

In [None]:
# The data to be inserted is a dictionary in which the key is the column name and the value is its content.
data = {"isin": "SK1120005824", "lei": "097900BHK10000084115", "company_name": "CENTRAL PERK"}

In [None]:
# The insert_row() method adds the content to the table
sqlengine_obj.insert_row(table1_obj, data)

In [None]:
# It is possible to query the table and check the values in it
result = sqlengine_obj.query_table(table1_obj)
result

In [None]:
# The result of the query is a list of tuples, so individual values can be read by row and column position.
# Position 0 of each row is skipped here; it appears to hold the autoincrement key added by add_idx.
print('isin: {}'.format(result[0][1]))
print('lei: {}'.format(result[0][2]))
print('company_name: {}'.format(result[0][3]))
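Positional access becomes hard to read as tables grow, so one option is to pair each tuple with the column names. The sketch below works on a hard-coded sample shaped like the query result above; the name `idx` for the autoincrement column is an assumption, since the notebook does not show what `add_idx` calls it.

```python
# Sample shaped like the result of query_table(): a list of tuples.
sample_result = [
    (1, 'SK1120005824', '097900BHK10000084115', 'CENTRAL PERK'),
]

# 'idx' is a hypothetical name for the autoincrement column; the other
# names are the ones passed to create_table() earlier in this notebook.
column_names = ['idx', 'isin', 'lei', 'company_name']

# zip() pairs each column name with the value at the same position.
rows_as_dicts = [dict(zip(column_names, row)) for row in sample_result]

first_row = rows_as_dicts[0]
print('isin: {}'.format(first_row['isin']))
print('lei: {}'.format(first_row['lei']))
print('company_name: {}'.format(first_row['company_name']))
```

Access by name keeps working if columns are later added or reordered, as long as `column_names` is kept in step with the table definition.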

## 4. Insert multiple rows in a table

In [None]:
# To insert multiple rows, add one dictionary per row to a list and pass the list to the insert method
data = []
data.append({"isin": "DE0005545503", "lei": "5299003VKVDCUPSS5X23", "company_name": "DUNDER MIFFLIN"})
data.append({"isin": "GB00B1YW4409", "lei": "254900B1P3S786KDAW57", "company_name": "HONEYDUKES"})
data

In [None]:
# The insert_row() method adds the multiple rows to the table
sqlengine_obj.insert_row(table1_obj, data)

In [None]:
# It is possible to query the table and check the values in it
result = sqlengine_obj.query_table(table1_obj)
result
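For readers who want to see roughly what a multi-row insert amounts to at the SQL level, here is a minimal sketch using Python's standard `sqlite3` module on an in-memory database. It does not use the entitymatching API; the table schema (TEXT columns and an `idx` autoincrement key) is an assumption made for the illustration.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(':memory:')

# Assumed schema: an autoincrement key plus the three text columns.
conn.execute(
    'CREATE TABLE table1 ('
    'idx INTEGER PRIMARY KEY AUTOINCREMENT, '
    'isin TEXT, lei TEXT, company_name TEXT)'
)

rows = [
    {"isin": "DE0005545503", "lei": "5299003VKVDCUPSS5X23", "company_name": "DUNDER MIFFLIN"},
    {"isin": "GB00B1YW4409", "lei": "254900B1P3S786KDAW57", "company_name": "HONEYDUKES"},
]

# Named placeholders (:isin, ...) are filled from the keys of each dictionary;
# executemany() runs the statement once per dictionary in the list.
conn.executemany(
    'INSERT INTO table1 (isin, lei, company_name) '
    'VALUES (:isin, :lei, :company_name)',
    rows,
)
conn.commit()

inserted = conn.execute(
    'SELECT idx, isin, lei, company_name FROM table1 ORDER BY idx'
).fetchall()
conn.close()

for row in inserted:
    print(row)
```

The list-of-dictionaries shape is the same one passed to `insert_row()` above, which is why each dictionary key has to match a column name.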

## 5. Select with a join

In [None]:
# Prepare to create a table
table2_name = 'table2'
columns_names = ['isin', 'company_name']

In [None]:
# Create table2
table2_obj = sqlengine_obj.create_table(table2_name, columns_names, add_idx=True)

In [None]:
# To insert multiple rows, add one dictionary per row to a list and pass the list to the insert method
data = []
data.append({"isin": "SK1120005824", "company_name": "CENTRAL PERK"})
data.append({"isin": "DE0005545503", "company_name": "DUNDER MIFFLIN"})

In [None]:
data

In [None]:
# The insert_row() method adds the multiple rows to the table
sqlengine_obj.insert_row(table2_obj, data)

In [None]:
# Checking the values
result = sqlengine_obj.query_table(table2_obj)
result

In [None]:
# Inspect the type of each column object in table2
for column in table2_obj.columns:
    print(type(column))

In [None]:
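# Build a join between the two tables on their shared isin column.
# Assumption: the table objects are SQLAlchemy Table instances, as their .columns attribute suggests.
join_clause = table1_obj.join(table2_obj, table1_obj.c.isin == table2_obj.c.isin)
query = join_clause.select()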
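The join this section aims at can also be sketched in plain SQL. The example below uses Python's standard `sqlite3` module on an in-memory database rather than the entitymatching API, and the column types and the `idx` key name are assumptions. It loads the same rows inserted earlier in this notebook and joins the two tables on `isin`.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Assumed schemas mirroring table1 and table2 from this notebook.
conn.executescript(
    """
    CREATE TABLE table1 (
        idx INTEGER PRIMARY KEY AUTOINCREMENT,
        isin TEXT, lei TEXT, company_name TEXT
    );
    CREATE TABLE table2 (
        idx INTEGER PRIMARY KEY AUTOINCREMENT,
        isin TEXT, company_name TEXT
    );
    """
)

conn.executemany(
    'INSERT INTO table1 (isin, lei, company_name) VALUES (?, ?, ?)',
    [
        ('SK1120005824', '097900BHK10000084115', 'CENTRAL PERK'),
        ('DE0005545503', '5299003VKVDCUPSS5X23', 'DUNDER MIFFLIN'),
        ('GB00B1YW4409', '254900B1P3S786KDAW57', 'HONEYDUKES'),
    ],
)
conn.executemany(
    'INSERT INTO table2 (isin, company_name) VALUES (?, ?)',
    [
        ('SK1120005824', 'CENTRAL PERK'),
        ('DE0005545503', 'DUNDER MIFFLIN'),
    ],
)

# An inner join keeps only the rows whose isin appears in both tables.
joined = conn.execute(
    """
    SELECT t1.isin, t1.lei, t2.company_name
    FROM table1 AS t1
    INNER JOIN table2 AS t2 ON t1.isin = t2.isin
    ORDER BY t1.idx
    """
).fetchall()
conn.close()

for row in joined:
    print(row)
```

HONEYDUKES is absent from the output because its ISIN exists only in `table1`; a `LEFT JOIN` would keep that row with a null `company_name` from `table2`.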

## 6. Drop tables 

In [None]:
# Drop table1 from the database
sqlengine_obj.drop_table(table1_obj)

In [None]:
# Drop table2 from the database
sqlengine_obj.drop_table(table2_obj)
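To confirm that a drop took effect, one way is to look the table up in SQLite's catalog. The sketch below uses the standard `sqlite3` module on an in-memory database; `sqlite_table_exists` is a hypothetical helper for this illustration and says nothing about how the API's own `table_exists()` is implemented.

```python
import sqlite3


def sqlite_table_exists(conn, table_name):
    """Hypothetical helper: True if the table is listed in SQLite's catalog."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE table1 (idx INTEGER PRIMARY KEY, isin TEXT)')

exists_before = sqlite_table_exists(conn, 'table1')

# DROP TABLE removes both the table definition and all of its rows.
conn.execute('DROP TABLE table1')

exists_after = sqlite_table_exists(conn, 'table1')
conn.close()

print(exists_before, exists_after)
```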

## 7. Disconnect from database

In [None]:
# Close the connection with the database
sqlengine_obj.disconnect()

In [None]:
# Check that the connection is no longer active
sqlengine_obj.is_connected()