# SQLAlchemy – NoSQL (optional)

This last course is optional and intended for those of you who have fully understood the entire SQL section. The topics are not covered in depth: the goal is simply to make you aware that they exist, and to give you some basic notions in case you run into them in the future or wish to explore them on your own to improve your mastery of databases with Python.

## SQLAlchemy

Unlike SQLite, SQLAlchemy is not a database engine or a DBMS, but a database access toolkit. More precisely, SQLAlchemy is best known for its most widely used component, the ORM (Object Relational Mapper). What is it? Python is an object-oriented language, and SQLAlchemy’s ORM maps the results of SQL queries to objects, which are easier and more powerful to manipulate in Python. SQLAlchemy is an interface between the database engine and Python: it hides the SQL queries behind Python code while preserving their efficiency.
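To make the idea of "mapping rows to objects" concrete, here is a minimal hand-written sketch that uses only the standard library (`sqlite3` and `dataclasses`). It is not how SQLAlchemy works internally; the `Country` class and the sample rows are assumptions chosen for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Country:
    """Hypothetical Python object standing in for one row of a Country table."""
    id: int
    name: str


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Country (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO Country VALUES (?, ?)",
    [(1, "Belgium"), (1729, "England")],
)

# Without mapping: each row is a plain tuple, accessed by position
rows = conn.execute("SELECT id, name FROM Country ORDER BY id").fetchall()
print(rows[0][1])

# With a (hand-written) mapping: each row becomes an object with named attributes
countries = [Country(*row) for row in rows]
print(countries[0].name)

conn.close()
```

An ORM automates this mapping step (and the reverse one, from objects back to SQL statements), so that the rest of the program only manipulates objects.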

Some of the classes used by SQLAlchemy are listed below (you will intuitively know which SQL entities they relate to):

* Base
* Table
* Column
* ForeignKey
* Integer
* String
* etc.

### Simple: using SQLAlchemy to connect to a database

The simplest use of SQLAlchemy is to declare an engine, use it to open a connection to an existing database, and send queries through that connection with functions like `pd.read_sql_query()`:


In [10]:
from sqlalchemy import create_engine, text
import os

engine = create_engine(os.path.join('sqlite:///','data', 'european-soccer.sqlite'))

In [11]:
import pandas as pd

def request(query, engine=engine):
    with engine.begin() as conn:
        return pd.read_sql_query(text(query), conn)

In [12]:
print(request('SELECT * FROM Country'))

       id         name
0       1      Belgium
1    1729      England
2    4769       France
3    7809      Germany
4   10257        Italy
5   13274  Netherlands
6   15722       Poland
7   17642     Portugal
8   19694     Scotland
9   21518        Spain
10  24558  Switzerland


With SQLAlchemy, you can create an engine for most DBMSs, so you get the same interface whichever DBMS you connect to. For example, if MySQL runs on your computer, you can connect to a MySQL database like this:

```Python
##################################################################
# To connect to a MySQL database running locally with SQLAlchemy #
##################################################################
from sqlalchemy import create_engine

user = "<user_name>"
password = "<user_password>"
db_name = "<database_name>"
port = 3306 # default port for MySQL
host = "127.0.0.1" # you can also try "localhost"; if MySQL runs on a remote machine,
                   # host will be the server address
# the mysql+mysqldb dialect relies on the mysqlclient driver being installed
connection_infos = f"mysql+mysqldb://{user}:{password}@{host}:{port}/{db_name}"
engine = create_engine(connection_infos)
```
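One pitfall with connection strings: if the password contains characters such as `@`, `:` or `/`, they must be URL-encoded, otherwise the URL is parsed incorrectly. The sketch below builds the string with `urllib.parse.quote_plus` from the standard library; the credentials and database name are made-up values for illustration.

```python
from urllib.parse import quote_plus

# Hypothetical credentials, chosen to contain reserved URL characters
user = "app_user"
password = "p@ss:word/1"
host = "127.0.0.1"
port = 3306
db_name = "shop"

# Escape the password so that '@', ':' and '/' are not read as URL separators
safe_password = quote_plus(password)
connection_infos = f"mysql+mysqldb://{user}:{safe_password}@{host}:{port}/{db_name}"
print(connection_infos)
```

The resulting string can then be passed to `create_engine()` as before.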

### Advanced : SQLAlchemy models

If we want to go beyond using SQLAlchemy as a simple connection interface and query the database without writing SQL, we have to create a model. It defines the mapping between the database tables (and query results) and the objects defined and used in Python.

A model is a class that inherits from the `Base` class and, as its name suggests, models the columns that make up a table, the relationships between tables, etc.

Remember our `movies.db` ?

Let’s list its tables and columns, as a refresher :

In [16]:
import sqlite3

######################################################
# List tables and columns of a database (arg = path) #
######################################################

def explore_db(path: str):
    conn = sqlite3.connect(path)
    c = conn.cursor()
    
    list_table = '''
    PRAGMA table_list;
    '''
    c.execute(list_table)
    for row in c.fetchall():
        table = row[1] # to improve readability
    
        # then get columns names
        column_list = 'PRAGMA table_info(' + table +')'
        print('\n----- table ' + row[1] + ' columns -----\n')
        c.execute(column_list)
        
        # print columns names
        for row in  c.fetchall():
            column = row[1] # to improve readability
            print(column)
        
        # finally print first lines of each table   
        print('\n table ' + table + ' first lines :\n')
        select_all = 'SELECT * FROM ' + table + ' LIMIT 5'
        c.execute(select_all)
        for row in  c.fetchall():
            print(row)

In [17]:
explore_db('data/movies.db')


----- table Credits columns -----

Id
Movie_id
Direction
Producer
Studio
Playscreen
Cast
Country

 table Credits first lines :

(1, 3, '"Big director"', '"Big producer"', '"Big studio"', '"Big screenwriter"', '"Big Actor 1, Big Actor 2, Other big actors"', '"Big country"')
(2, 1, '"Unknown director"', '"Unknown producteur"', '"Unknown studio"', '"Unknown screenwriter"', '"Unknown actor 1, Unknown acteur 2, Other unknown actors"', '"Unknown country"')
(3, 2, '"Small director"', '"Small producer"', '"Small studio"', '"Small screenwriter"', '"Small actor 1, Small actor 2, Small other actors"', '"Small country"')
(4, 5, '"Acceptable director"', '"Acceptable producer"', '"Acceptable studio"', '"Acceptable screenwriter"', '"Acceptable actor 1, Acceptable actor 2, Other acceptable actors"', '"Acceptable country"')
(5, 4, '"Incompetent director"', '"Incompetent producer"', '"Incompetent studio"', ' "Incompetent screenwriter"', '"Incompetente actor 1, Incompetent actor 2, Other incompetent act

So we have two tables:

* Credits (8 columns, see above)
* Movies (7 columns: Id, Title, Date, Duration, Budget, First_week_viewers, Votes)

Now that we have refreshed our memory, let’s connect to the database:

In [1]:
from sqlalchemy import create_engine, text
import os

engine = create_engine(os.path.join('sqlite:///','data', 'movies.db'))

And let’s create the following model:

In [2]:
from sqlalchemy import Table, Column, ForeignKey, Integer, String, Float
from sqlalchemy.orm import declarative_base

Base = declarative_base() # create the base class that the model classes
                          # defined below will inherit from

class Movies(Base):    # define the model class for the Movies table
    __tablename__ = 'Movies' 
    Id = Column(Integer, primary_key=True) # column definitions of the table
    Title = Column(String)
    Date = Column(String)
    Duration = Column(Integer)
    Budget = Column(Integer)
    First_week_viewers = Column(Integer)
    Votes = Column(Float)
    
class Credits(Base):
    __tablename__ = 'Credits'
    Id = Column(Integer, primary_key=True)
    Movie_id = Column(Integer, ForeignKey('Movies.Id')) # relationship between tables
    Direction = Column(String)
    Producer = Column(String)
    Studio = Column(String)
    Playscreen = Column(String)
    Cast = Column(String)
    Country = Column(String)

Now that the model is declared and the connection to the database is established, let’s see how to make a query.

To do that, we have to open a `Session`. The SQLAlchemy documentation describes it as a "holding zone" for the objects that we have instantiated or loaded, and adds that it "provides the interface where SELECT and other queries are made that will return and modify ORM-mapped objects".

Once the `session` is instantiated, we can call its `.query()` method, which we chain with other methods to build a query.

Here is the query that selects all the records of the `Movies` table:

In [3]:
from sqlalchemy.orm import Session

with Session(engine) as session:
    results = session.query(Movies).all()

To display the results, we simply iterate over them and read the columns we want:

In [4]:
for result in results: 
    print(result.Title, result.Votes) 

A good movie 4.36
Another good movie, slightly better 4.63
A bad movie, but with some success 4.26
A very bad movie 2.86
A not so bad movie 3.86


The `.filter()` method is used to add selection conditions:

In [6]:
with Session(engine) as session:
    results = session.query(Movies).filter(Movies.Votes > 4.0).all()
    for result in results: 
        print(result.Title, result.Votes) 

A good movie 4.36
Another good movie, slightly better 4.63
A bad movie, but with some success 4.26


Other methods for building queries:
* `.group_by()`
* `.count()`
* `.order_by()`
* `.join()`
* etc.

You can read the [query guide](https://docs.sqlalchemy.org/en/14/orm/queryguide.html) in the SQLAlchemy documentation.
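As an illustration of some of these methods, here is a small self-contained sketch that runs against an in-memory SQLite database. The `Movie` class is a simplified, hypothetical stand-in for the `Movies` model (only three columns), and the rows are sample data inserted for the demonstration.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Movie(Base):  # hypothetical, simplified stand-in for the Movies model
    __tablename__ = "Movies"
    Id = Column(Integer, primary_key=True)
    Title = Column(String)
    Votes = Column(Float)


engine = create_engine("sqlite:///:memory:")  # throwaway in-memory database
Base.metadata.create_all(engine)              # create the table from the model

with Session(engine) as session:
    session.add_all([
        Movie(Title="A good movie", Votes=4.36),
        Movie(Title="A very bad movie", Votes=2.86),
        Movie(Title="A not so bad movie", Votes=3.86),
    ])
    session.commit()

    # .order_by(): sort the movies by descending votes
    ranked = session.query(Movie).order_by(Movie.Votes.desc()).all()
    titles = [movie.Title for movie in ranked]

    # .filter() chained with .count(): number of movies above a threshold
    n_good = session.query(Movie).filter(Movie.Votes > 3.5).count()

    # aggregate functions are available through `func`
    best_vote = session.query(func.max(Movie.Votes)).scalar()

print(titles, n_good, best_vote)
```

Each method returns a new query object, which is why the calls can be chained until a terminal method such as `.all()`, `.count()` or `.scalar()` actually sends the query to the database.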

You may ask: isn’t this a bit complicated just to write a query? It is a high-level approach that answers specific needs (no SQL to write, object-oriented code, independence from the DBMS, etc.). Writing SQL directly is a lower-level approach, which comes with its own set of difficulties (lack of flexibility, queries written for one specific DBMS, no OOP, etc.).

As previously said, this is not an extensive lecture about SQLAlchemy, but the idea is to mention the existence of this library, in case you are working on a project that uses it or requires it.

## NoSQL

### Limits of the relational model

As we have learned, SQL was designed to query relational databases. This kind of database has numerous advantages:

* the relational model makes it easy to query relationships between data belonging to multiple tables
* data is stored in a well-structured manner: the structure of the data model and the data types are defined before the data is actually manipulated
* this structure brings constraints that guarantee that storage is safe and robust (very low risk of error). You can’t add or delete features or data in a way that would make the dataset incoherent (delete a column that is referenced by a foreign key in another table, create a column whose datatype or default value conflicts with existing data or with the structure definition, etc.)
* [ACID](https://en.wikipedia.org/wiki/ACID) compliance:
    * Atomicity: a query is executed successfully, or not at all (if an error occurred during the execution). For example, an UPDATE cannot stop halfway, with some data modified and the rest not. This prevents inconsistencies from appearing.
    * Consistency: the dataset is valid before a query is processed, and valid once it has been processed. If you write new data to the dataset, it must be valid according to all the rules defined in the model
    * Isolation: when several users access the database at the same time, the DBMS has to deal with concurrent queries (that read and write the same data at the same time). The isolation principle ensures that the database ends up in the same state as if the queries had been executed sequentially (one after the other)
    * Durability: long-term storage is safe: committed data can’t be lost
  The ACID properties make it possible to perform complex operations, such as joins, in one single query
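Atomicity can be observed with the `sqlite3` module from the standard library. In the sketch below, the `accounts` table, its `CHECK` constraint and the amounts are assumptions made for the demonstration: the second `UPDATE` violates the constraint, so the whole transaction is rolled back, including the first `UPDATE` that had succeeded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

try:
    # Used as a context manager, the connection commits if the block succeeds
    # and rolls back if an exception is raised inside it
    with conn:
        conn.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'bob'")
        # This statement would make alice's balance negative: CHECK constraint fails
        conn.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'alice'")
except sqlite3.IntegrityError as exc:
    print("transaction rolled back:", exc)

# Both balances are unchanged: the first UPDATE was undone as well
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
conn.close()
```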

But these qualities can become drawbacks in some cases:

* the emphasis on (and need for) a model and a predefined structure lengthens the development time of such databases
* another difficulty caused by this dependence on structure is that a relational database can’t easily manage unstructured data or data whose characteristics are not known in advance
* the implementation, management and administration of relational databases are monolithic: they are hard to scale horizontally and easier to scale vertically

  Some terminology is needed here:
    * horizontal scaling increases the processing capacity of a database by adding servers or nodes that manage the data
    * upscaling, or vertical scaling, increases the processing capacity of a database by upgrading the server that manages it (more memory, more computing power, more storage capacity, etc.), which is more expensive and more demanding, and you can’t upscale forever (there is a physical limit: maximum memory, CPU, etc.)

  When a database gets bigger, we may want to divide the data into smaller entities:
    * horizontal partitioning: the records (rows) belonging to the same table are distributed between several tables. For example, a customers table could be divided into several tables, each gathering the customers of a specific city (one table for Marseille, one for Paris, etc.), all with the same columns. Horizontal scaling is easier in this case: you can add servers that each manage given tables. The problem is that the schema becomes confusing (several tables holding records with the same structure), especially when you have to write data and enforce constraints (reading is far simpler to deal with)
    * vertical partitioning: a table is split along its columns (each row gets split). For example, a customers table containing the customer’s identity (first name, last name, birthday…), address and orders could be divided into an identity table, an address table and an orders table
    * sharding: it is similar to horizontal partitioning, but shards (partitions) go beyond that. In horizontal partitioning there is only one schema, whereas sharding distributes the rows between several instances of the schema. Each shard is totally independent and can be hosted on a different server, datacenter, etc.
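The three ways of dividing data described above can be sketched with plain Python structures. Everything here is illustrative: the sample customers, the node names and the choice of a SHA-256 hash to route rows are assumptions, not a description of how any particular DBMS does it.

```python
import hashlib

customers = [
    {"id": "c1", "name": "Alice", "city": "Paris", "address": "1 rue A"},
    {"id": "c2", "name": "Bob", "city": "Marseille", "address": "2 rue B"},
    {"id": "c3", "name": "Chloe", "city": "Paris", "address": "3 rue C"},
]

# Horizontal partitioning: same columns everywhere, rows split by city
by_city = {}
for row in customers:
    by_city.setdefault(row["city"], []).append(row)

# Vertical partitioning: same rows, columns split into two tables linked by id
identity = [{"id": r["id"], "name": r["name"]} for r in customers]
addresses = [{"id": r["id"], "city": r["city"], "address": r["address"]} for r in customers]

# Sharding: each row is routed to an independent node, here with a stable hash
SHARDS = ["node_a", "node_b", "node_c"]  # hypothetical node names


def shard_for(key: str) -> str:
    """Return the shard in charge of a key (same key, same shard, every time)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


placement = {row["id"]: shard_for(row["id"]) for row in customers}
print(by_city.keys(), placement)
```

A stable hash is used rather than Python's built-in `hash()`, whose value for strings changes from one interpreter run to the next.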

This leads to the idea that when data gets big, and really big, the database schema needs flexibility. The relational model reaches its limits.

### NoSQL 

NoSQL means "Not only SQL", and not "Not SQL"! It rather refers to "no relational model". Examples of NoSQL solutions (disclaimer: the categorization is not as strict as presented here, these are just examples to fix ideas):
* MongoDB, Couchbase (documents, i.e. more or less complex JSON, rather than rows)
* Redis, Amazon DynamoDB (key-value)
* Cassandra, Bigtable, Accumulo (columns)
* Amazon Neptune, Neo4j (graphs)

NoSQL addresses the limits described above:

* NoSQL is flexible because *it just does not support* relationships between tables (simple!). So it can deal with unstructured data or data whose type is not well known (or totally unknown) before we build the database. The structures in NoSQL can be:
    * an unstructured document (JSON, BSON, XML, etc.)
    * a key-value pair (particularly efficient for unique but complex values)
    * a table (stored by columns rather than by rows: efficient if queries only use a few columns)
    * graphs
    * time series
    * etc.
* NoSQL does not follow one single concept or schema: it covers different types of non-relational databases that correspond to different use cases
* a NoSQL database can be built dynamically: the schema does not have to be defined before we begin to manage data. Moreover, documents belonging to the same collection can have different structures (for example, key-value documents do not need to have the same keys). Sometimes the schema is static, for example if we deal with a lot of column-oriented documents (tables).
* dealing with unstructured data makes sharding easier, and therefore makes building distributed databases and scaling horizontally far easier. NoSQL engines are optimized to operate in highly distributed environments (datacenter scale)
* in fact, NoSQL databases are generally used to build distributed databases holding large amounts of data. One part of this process is the replication of shards of data from one node to others (replicas). This is really useful for balancing the load when a lot of clients want to access the data at the same time. But this operation takes time, which leads to a high risk of inconsistency if a client accesses data in a replica that has not been updated yet (this is more a problem of distributed databases than of NoSQL specifically, but NoSQL databases are generally distributed).
* NoSQL is not required to be ACID compliant, and there is no guaranteed simple way of performing a JOIN operation in one single request, for example (as there are no relations…). This does not mean that no NoSQL solution is ACID compliant (some of them are)
* Some NoSQL solutions can create in-memory-only databases (no physical storage of data in files)
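To picture the difference between some of these structures, here is a minimal sketch that stores the same (made-up) records row by row, column by column, and as key-value pairs, using only Python dictionaries and lists. It only illustrates the layouts; real engines add storage, indexing and distribution on top.

```python
# Row-oriented layout: one record after the other (what a relational table looks like)
rows = [
    {"id": 1, "name": "Alice", "age": 25},
    {"id": 2, "name": "Bob", "age": 31},
    {"id": 3, "name": "Charlie", "age": 40},
]

# Column-oriented layout: all the values of one column are stored together
columns = {key: [row[key] for row in rows] for key in rows[0]}

# A query that only needs one column reads a single list,
# instead of walking through every field of every record
average_age = sum(columns["age"]) / len(columns["age"])

# Key-value layout: each record is an opaque value reached through a unique key
key_value = {f"user:{row['id']}": row for row in rows}

print(columns["name"], average_age, key_value["user:2"])
```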

Moreover, be careful and know what you’re doing. Relational databases can also be dynamic, distributed, and deal with JSON or exotic data structures; it’s just easier with NoSQL tools. Conversely, as MongoDB is very easy to use, some people reflexively use a MongoDB database everywhere, even in situations where they end up building a NoSQL database that follows a relational model… which is nonsense.

### UnQLite

UnQLite is an embedded NoSQL component comparable to SQLite (no need to install a third-party server, lightweight: <1.8 MB). A Python binding can be installed with `pip`, along with Cython (used to build the binding). UnQLite is a serverless (like SQLite) JSON document store built on a fast key/value database. It has a specific scripting language to manage the JSON document store: Jx9. UnQLite is nevertheless ACID compliant. It is a single-file database (like SQLite), written in C with no dependencies (like SQLite), cross-platform (for the file format) and BSD licensed. The documentation of the Python binding is [here](https://unqlite-python.readthedocs.io/en/latest/).

Cython is a language very close to Python, to which it adds support for some C/C++ instructions. It simplifies the writing of extensions for Python.

#### unqlite installation

In [11]:
!pip install Cython unqlite



In [2]:
import unqlite

In [3]:
from unqlite import UnQLite

#### Basic operations

For a first try, let’s create a database that lives in memory only (simply do not specify a filename):

In [4]:
database = UnQLite()

To add an element to the database, you just have to assign a value to a key:

In [5]:
database['key'] = 'value'
'key' in database

True

In [11]:
database.exists('key')

True

The `.fetch()` method allows us to retrieve a value stored by calling its key :

In [8]:
database.fetch('key')

b'value'

Careful: in unqlite, everything (even text) is returned as a byte string, a series of byte values that do not necessarily make sense to humans. For text, even if the output looks readable, this is not guaranteed: you have to decode it according to the encoding that was used (UTF-8, cp1252, ASCII…).

In [30]:
database.fetch('key').decode('utf-8')

'value'
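To see why the choice of encoding matters, here is a small sketch that uses only Python's built-in `bytes` type (no UnQLite involved). The sample word is an arbitrary choice: it contains an accented character, which is where encodings start to disagree.

```python
raw = "café".encode("utf-8")   # what a byte-oriented store would hand back
print(raw)                     # the 'é' is stored as two bytes: b'caf\xc3\xa9'

good = raw.decode("utf-8")     # decoding with the right encoding restores the text
bad = raw.decode("cp1252")     # decoding with the wrong one silently garbles it
print(good, bad)

try:
    raw.decode("ascii")        # ASCII cannot represent these bytes at all
except UnicodeDecodeError as exc:
    print("not ASCII:", exc)
```

For plain ASCII text such as `b'value'`, all three encodings give the same result, which is why the problem often goes unnoticed until accented characters appear.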

`.store()` is a method to add a key-value pair to the database :

In [10]:
database.store('key0', 'value0')
database.fetch('key0')

b'value0'

`.append()`, unsurprisingly, appends a value to the data stored under the given key (if no data is associated with the key, it behaves like `.store()`):

In [12]:
database.append('key0', 'value_appended')
database.fetch('key0')

b'value0value_appended'

Another way to update the database is to pass it a dictionary:

In [9]:
database.update(
    {'key1': 'value1',
     'key2': 'value2'
    }
)

database.fetch('key2')

b'value2'

And, finally, you can drop a key-value pair with `.delete()` :

In [13]:
database.delete('key0')
database.exists('key0')

False

#### Keys - values

NoSQL databases are often used to store data as key-value pairs. Let’s create an example database that follows this schema:

In [16]:
# Open (or create) an UnQLite database
users_db = unqlite.UnQLite()

# Insert key-value pairs
users_db['user_1'] = {"name": "Alice", "age": 25}
users_db['user_2'] = {"name": "Bob", "email": "bob@example.com"}
users_db['user_3'] = {"name": "Charlie", "adress": {"city": "Paris", "zip_code": 75001}}
users_db['config'] = {"theme": "dark", "lang": "fr"}


Get the keys of the database (as we could get the column names in a relational database):

In [19]:
list(users_db.keys()) # ATTENTION, we have to cast the result of the .keys() method to list to print it



['user_1', 'user_2', 'user_3', 'config']

We could also write :

In [24]:
for user in users_db.keys():
    print(user)

user_1
user_2
user_3
config


The result of the `.keys()` method is an iterable.

What interests us are the values :

In [31]:
for value in users_db.values():
    print(value.decode('utf-8'))

# or
# list(users_db.values())

{'name': 'Alice', 'age': 25}
{'name': 'Bob', 'email': 'bob@example.com'}
{'name': 'Charlie', 'adress': {'city': 'Paris', 'zip_code': 75001}}
{'theme': 'dark', 'lang': 'fr'}


We have `.keys()` and `.values()` methods as for dictionaries (and we also saw `.update()` before), so do we have an `.items()` method too?

In [32]:
for key, value in users_db.items():
    print(f"{key}: {value.decode('utf-8')}")


user_1: {'name': 'Alice', 'age': 25}
user_2: {'name': 'Bob', 'email': 'bob@example.com'}
user_3: {'name': 'Charlie', 'adress': {'city': 'Paris', 'zip_code': 75001}}
config: {'theme': 'dark', 'lang': 'fr'}


If the database behaves like a dictionary (or JSON), it is easy to select and filter data. For example, let’s build a students database and select the students with grades above 10:

In [9]:
import json

# unqlite hands the values back as byte strings that use single quotes here,
# so we swap them for double quotes before parsing the text as JSON
def decode_json(value: bytes) -> dict:
    return json.loads(value.decode('utf-8').replace("'", "\""))

students_db = unqlite.UnQLite()

# create the db
students_db["student_1"] = {"name": "Alice", "grade": 14}
students_db["student_2"] = {"name": "Bob", "grade": 9}
students_db["student_3"] = {"name": "Charlie", "grade": 16}
students_db["student_4"] = {"name": "David", "grade": 7}
students_db["student_5"] = {"name": "Eve", "grade": 12}

# select students with grade > 10
successful_students = [
    decode_json(value) 
    for key, value in students_db.items() 
    if decode_json(value)["grade"] > 10
]

In [10]:
successful_students

[{'name': 'Alice', 'grade': 14},
 {'name': 'Charlie', 'grade': 16},
 {'name': 'Eve', 'grade': 12}]
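Note that swapping single quotes for double quotes only works for simple values: it breaks as soon as a string contains an apostrophe, or when the dictionary holds `True`, `False` or `None`. A more robust option, sketched below without UnQLite (the byte strings stand in for what the database would store and return), is to serialize with `json.dumps()` before storing and parse with `json.loads()` after fetching. The sample document is made up.

```python
import json

doc = {"name": "Miles O'Brien", "grade": 15, "admin": True}

# What gets stored when a dict is converted with str(): Python syntax, not JSON
as_text = str(doc).encode("utf-8")
try:
    json.loads(as_text.decode("utf-8").replace("'", '"'))
    naive_ok = True
except json.JSONDecodeError as exc:
    naive_ok = False
    print("quote swapping failed:", exc)

# Serializing explicitly to JSON before storing makes the round trip reliable
stored = json.dumps(doc).encode("utf-8")         # bytes to put in the database
restored = json.loads(stored.decode("utf-8"))    # dict rebuilt after fetching
print(restored)
```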

#### Collections

Besides key-value data, another structure often used in NoSQL is the collection of documents. A *collection* is a set of documents, which are JSON objects. Each document can have a different structure. This is very different from relational databases.

In [68]:
users_db = unqlite.UnQLite()

# connect to a collection (you can get a handle on a collection that doesn’t exist yet, then create it)
collection = users_db.collection('users')
if not collection.exists():
    collection.create() 
    
# add documents with different structures
collection.store([
    {"name": "Alice", "age": 25},  # 2 fields : name, age
    {"name": "Bob", "email": "bob@example.com"},  # 2 fields : name, email (no age)
    {"name": "Charlie", "adress": {"city": "Paris", "zip_code": 75001}},  # nested object/JSON
    {"name": "David", "hobbies": ["Music", "Code"]},  # 2 fields : name, list
    {"name": "Eve", "age": 30, "email": "eve@example.com", "admin": True}  # 4 fields
])

# fetch all documents of the collection
documents = collection.all()
print("'Users' collection documents :\n")
for doc in documents:
    print(doc)

'Users' collection documents :

{'name': 'Alice', 'age': 25, '__id': 0}
{'name': 'Bob', 'email': 'bob@example.com', '__id': 1}
{'name': 'Charlie', 'adress': {'city': 'Paris', 'zip_code': 75001}, '__id': 2}
{'name': 'David', 'hobbies': ['Music', 'Code'], '__id': 3}
{'name': 'Eve', 'age': 30, 'email': 'eve@example.com', 'admin': True, '__id': 4}


Question: Python already handles dictionaries and JSON. What’s the advantage of a database engine here? As we have said several times, database management systems offer safety, robustness and ease of use in the data storage process. The two sections below present some mechanisms that help achieve these goals.

#### Cursor

We used cursors with SQLite, but we did not explain why there is a cursor and why we use it. We presented cursors as read heads that we move through the database to the place where we want to perform our operations (read, write, get information…). What’s the point of this method? Keep in mind that a cursor is a means of accessing data sequentially, which has several advantages:

* saves memory: we do not load all the data into memory at once
* preserves performance: in (very) large databases, it is completely inefficient to retrieve all the stored elements
* fine-grained control: we can stop browsing the data at any time, which avoids loading the remaining data into memory

In a previous example we used `.items()` to retrieve key-value pairs, but it loads all the data into memory, which is inefficient for the reasons mentioned above.

We will use a cursor instead, which allows us to browse the data gradually, without loading everything. When there are many entries, this avoids slowdowns in processing. In addition, it allows us to add stopping conditions (e.g. stopping after a certain number of results). You can think of cursors as iterators, particularly efficient with large amounts of key-value data.
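The same idea can be observed with the SQLite cursors we already know. The sketch below uses only the standard library; the table and its 10,000 generated rows are assumptions made for the demonstration. The cursor is consumed like an iterator and the loop stops as soon as three matching rows are found, so the remaining rows are never fetched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    ((f"student_{i}", i % 21) for i in range(10_000)),  # grades cycle from 0 to 20
)

# Iterating over the cursor fetches rows one at a time
cursor = conn.execute("SELECT name, grade FROM students ORDER BY rowid")
first_three = []
for name, grade in cursor:
    if grade > 10:
        first_three.append(name)
    if len(first_three) == 3:  # stopping condition: no need to read further
        break

print(first_three)
conn.close()
```

By contrast, calling `cursor.fetchall()` would have built a list of all 10,000 rows before the filter was applied.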

Let’s see how cursors work with our example database `students_db`:

In [54]:
# it is recommended to use .cursor() as a context manager
with students_db.cursor() as cursor:

    # .first() method is used to place the cursor on the first element
    cursor.first()

    # .key() method returns the key of the element the cursor is pointing to
    k = cursor.key()
    print('key of the first element :', k)

    # .value() method returns the value of the element the cursor is pointing to
    v = cursor.value()
    print('\nvalue of the first element:', decode_json(v))

    # .next_entry() method moves the cursor to the next element in the database
    cursor.next_entry()
    k = cursor.key()
    print('\nkey of the next element :', k)    

    # .last() method moves the cursor to the last element in the database
    cursor.last()
    k = cursor.key()
    print('\nkey of the last element :', k)

    # .previous_entry() method moves the cursor to the previous element in the database
    cursor.previous_entry()
    k = cursor.key()
    print('\nkey of the previous element :', k)
    
    # .reset() method moves the cursor back to the first element in the database
    cursor.reset()
    k = cursor.key()
    print('\nkey of the element pointed after a reset :', k)

    # .seek(key_seeked) method moves the cursor to the element whose key is key_seeked
    cursor.seek('student_3')
    k = cursor.key()
    print('\nkey of the element seeked (student_3) :', k)

    # .fetch_until(stop_key) method iterates from the current element to the element whose key is stop_key
    it = [(k, decode_json(v)) for k, v in cursor.fetch_until('student_4')]
    print('\niteration from the (student_3) to the (student_4) elements:\n', it)

key of the first element : student_1

value of the first element: {'name': 'Alice', 'grade': 14}

key of the next element : student_2

key of the last element : student_5

key of the previous element : student_4

key of the element pointed after a reset : student_1

key of the element seeked (student_3) : student_3

iteration from the (student_3) to the (student_4) elements:
 [('student_3', {'name': 'Charlie', 'grade': 16}), ('student_4', {'name': 'David', 'grade': 7})]


See the [doc](https://unqlite-python.readthedocs.io/en/latest/api.html) for other methods.

Now let’s select the students with grades above 10 using a cursor :

In [59]:
with students_db.cursor() as cursor:
    while True:
        # get the value of the current cursor
        student = decode_json(cursor.value())
    
        # check if grade > 10
        if student['grade'] > 10:
            print(f"{student['name']} - Grade: {student['grade']}")
    
        # go to next record
        cursor.next_entry()
        if not cursor.is_valid():  # returns False once we go past the last element (invalid position)
            break

Alice - Grade: 14
Charlie - Grade: 16
Eve - Grade: 12


#### Transactions

We have not talked about the concept of transaction yet, although it is important in database management.

A database transaction is an "atomic operation" (or "unit of work", see this [Wikipedia article](https://en.wikipedia.org/wiki/Unit_of_work)) on a database: any operation, however small, that can affect the database. A transaction must be defined and implemented in such a way that the database can recover when errors occur, to preserve its integrity. Another concern arises when several operations are requested at the same time (concurrency). This leads to the concepts of isolation, consistency, atomicity… and you will recognise the ACID properties mentioned above.

ACID matters even in the NoSQL world, because large, distributed databases, precisely because of their size and distributed design, are more susceptible to errors in the event of concurrent access.

One way to view transactions from an operational perspective is the example of double-entry accounting. These operations are designed so that they also control their own validity: a debit on one account must be offset by a credit on another account. If this is not the case, the transaction is cancelled.

* Debit 100€ to a supplier account
* Credit 100€ to checking account
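The double-entry rule can be sketched in a few lines of plain Python. The function name, the account names and the sign convention (a positive amount is a debit, a negative amount is a credit) are assumptions made for this illustration: the entries are applied all together if they offset each other, and not at all otherwise.

```python
def post_entries(balances, entries):
    """Apply a list of (account, amount) entries only if they sum to zero.

    Assumed convention: positive amount = debit, negative amount = credit.
    """
    if sum(amount for _, amount in entries) != 0:
        raise ValueError("unbalanced transaction, nothing was applied")
    updated = dict(balances)  # work on a copy: the original is untouched on failure
    for account, amount in entries:
        updated[account] = updated.get(account, 0) + amount
    return updated


balances = {"supplier": 0, "checking": 500}

# Balanced transaction: debit 100 on one account, credit 100 on the other
balances = post_entries(balances, [("supplier", 100), ("checking", -100)])
print(balances)

# Unbalanced transaction: it is cancelled as a whole
try:
    post_entries(balances, [("supplier", 100), ("checking", -60)])
except ValueError as exc:
    print("cancelled:", exc)
```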

UnQLite, for example, offers a transaction system: you can define a context for a transaction (recommended), decide when the transaction starts and which instructions it contains and, most importantly, cancel it with `.rollback()` if something goes wrong (error raised, condition not met, etc.).

IMPORTANT: this transaction mechanism only works on databases stored in files (it has no effect on in-memory databases).

Imagine we want to add a new record to a collection, but something goes wrong during the operation, so we cancel it:


In [78]:
transactions_db = unqlite.UnQLite('transactions_demo.db') # transactions only work for db stored in file

# Let’s create a collection base

collection = transactions_db.collection('transactions')

if not collection.exists():
    collection.create() 

collection.store([
    {"name": "Alice", "age": 25},
    {"name": "Bob", "email": "bob@example.com"},
    {"name": "Charlie", "adress": {"city": "Paris", "zip_code": 75001}},
    {"name": "David", "hobbies": ["Music", "Code"]},
    {"name": "Eve", "age": 30, "email": "eve@example.com", "admin": True}
])
transactions_db.commit() # don’t forget to commit to file

# Now, let’s add a record, securing the process with a transaction :
    
with transactions_db.transaction():
    try:
        # start a transaction
        transactions_db.begin()
        
        # add a record
        collection.store([
            {"name": "John", "age": 25}
        ])
        
        # Raise an error ! (simulated here, of course)
        raise ValueError("Some error occurs during the recording process!")
    
        # commit the insertion of a new record
        transactions_db.commit()
        
    # if an error occurred, roll back the transaction:
    except Exception as e:
        print(f"Error detected: {e}")
        
        transactions_db.rollback()
        print('rollback done')
    
    # Let’s check whether the record was inserted
    documents = collection.all()
    print("\nDocuments in the collection (after the rollback) :")
    for doc in documents:
        print(doc)

Error detected: Some error occurs during the recording process!
rollback done

Documents in the collection (after the rollback) :
{'name': 'Alice', 'age': 25, '__id': 0}
{'name': 'Bob', 'email': 'bob@example.com', '__id': 1}
{'name': 'Charlie', 'adress': {'city': 'Paris', 'zip_code': 75001}, '__id': 2}
{'name': 'David', 'hobbies': ['Music', 'Code'], '__id': 3}
{'name': 'Eve', 'age': 30, 'email': 'eve@example.com', 'admin': True, '__id': 4}


No new element was added!