# Version Control with FolderDB

This notebook demonstrates how to use the version control features of the FolderDB class, including:
- Initializing a database with version control
- Making commits with descriptive messages
- Listing all versions
- Reverting to previous versions

## Setup and Imports

First, let's import the required libraries and set up our environment.

In [1]:
import os
import pandas as pd
from folderdb import FolderDB

## Initialize Database

Let's create a folder for our database and initialize the FolderDB instance.

In [2]:
# Create a folder for our database (no error if it already exists)
db_folder = "version_control_db"
os.makedirs(db_folder, exist_ok=True)

# Initialize the database
db = FolderDB(db_folder)
print(f"Created FolderDB at: {db_folder}")

Created FolderDB at: version_control_db


## Create Initial Data

Let's create some sample DataFrames with user and order information.

In [3]:
# Create users DataFrame
users_df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'city': ['New York', 'London', 'Paris']
}, index=['user1', 'user2', 'user3'])

# Create orders DataFrame
orders_df = pd.DataFrame({
    'product': ['Laptop', 'Phone', 'Tablet'],
    'price': [1000, 500, 300],
    'quantity': [1, 2, 3]
}, index=['order1', 'order2', 'order3'])

print("Users DataFrame:")
display(users_df)
print("\nOrders DataFrame:")
display(orders_df)

Users DataFrame:


| | name | age | city |
|---|---|---|---|
| user1 | Alice | 25 | New York |
| user2 | Bob | 30 | London |
| user3 | Charlie | 35 | Paris |



Orders DataFrame:


| | product | price | quantity |
|---|---|---|---|
| order1 | Laptop | 1000 | 1 |
| order2 | Phone | 500 | 2 |
| order3 | Tablet | 300 | 3 |


## Save Initial Data and Make First Commit

Now let's save our DataFrames to the database and create our first commit.

In [4]:
# Save DataFrames to database
print("1. Saving initial data...")
db.upsert_df("users", users_df)
db.upsert_df("orders", orders_df)

# Commit initial state
print("\n2. Committing initial state...")
db.commit("Initial commit with users and orders data")

print("\nDatabase state after initial commit:")
print(db)

1. Saving initial data...

2. Committing initial state...
Initialized git repository in version_control_db
Committing changes in the folder: version_control_db
Committed changes with message: Manual Commit: 2025-03-26@17-20 Initial commit with users and orders data
Commit successful.

Database state after initial commit:
FolderDB at version_control_db
--------------------------------------------------
users.jsonl:
  Size: 157 bytes
  Count: 3
  Key range: user1 to user3
  Linted: False
orders.jsonl:
  Size: 171 bytes
  Count: 3
  Key range: order1 to order3
  Linted: False


## Make Changes and Create Second Commit

Let's make some changes to our data and create another commit.

In [5]:
# Make some changes
print("3. Making changes...")

# Add a new order
new_order = pd.DataFrame({
    'product': ['Smart Watch'],
    'price': [200],
    'quantity': [1]
}, index=['order4'])
db.upsert_df("orders", new_order)

# Update a user
updated_user = pd.DataFrame({
    'name': ['Alice Smith'],
    'age': [26],
    'city': ['Boston']
}, index=['user1'])
db.upsert_df("users", updated_user)

# Commit changes
db.commit("Added new order and updated user information")

print("\nDatabase state after changes:")
print(db)

3. Making changes...
Committing changes in the folder: version_control_db
Committed changes with message: Manual Commit: 2025-03-26@17-21 Added new order and updated user information
Commit successful.

Database state after changes:
FolderDB at version_control_db
--------------------------------------------------
users.jsonl:
  Size: 215 bytes
  Count: 3
  Key range: user1 to user3
  Linted: False
orders.jsonl:
  Size: 233 bytes
  Count: 4
  Key range: order1 to order4
  Linted: False


## List All Versions

Let's see all the versions we've created.

In [6]:
print("4. Listing all versions:")
versions = db.version()
for hash_value, message in versions.items():
    print(f"{hash_value[:8]}: {message}")

4. Listing all versions:
264bea4f: Manual Commit: 2025-03-26@17-21 Added new order and updated user information
43b7dddf: Manual Commit: 2025-03-26@17-20 Initial commit with users and orders data
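
Each message in the listing carries a `Manual Commit:` prefix and a timestamp ahead of the text passed to `db.commit()`. If the timestamp is needed as a value, it can be split out of the message. This is a sketch that assumes the `YYYY-MM-DD@HH-MM` layout seen in the output above; `parse_commit_message` is a hypothetical helper, not part of FolderDB.

```python
from datetime import datetime

PREFIX = "Manual Commit: "


def parse_commit_message(message):
    """Split a logged commit message into (timestamp, text).

    Assumes the format "Manual Commit: YYYY-MM-DD@HH-MM <text>".
    """
    if not message.startswith(PREFIX):
        raise ValueError(f"unexpected commit message format: {message!r}")
    stamp, _, text = message[len(PREFIX):].partition(" ")
    return datetime.strptime(stamp, "%Y-%m-%d@%H-%M"), text


when, text = parse_commit_message(
    "Manual Commit: 2025-03-26@17-20 Initial commit with users and orders data"
)
print(when.isoformat(), "|", text)
```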


## Revert to Initial State

Now let's revert to our initial state and verify the data.

In [7]:
# db.version() lists commits newest first, so the last key is the initial commit
initial_hash = list(versions.keys())[-1]

# Revert to initial state
print(f"\n5. Reverting to initial state (commit {initial_hash[:8]})...")
db.revert(initial_hash)

# Verify the data is back to original state
print("\n6. Verifying reverted data:")
print("\nUsers:")
display(db.get_df(["users"])["users"])
print("\nOrders:")
display(db.get_df(["orders"])["orders"])


5. Reverting to initial state (commit 43b7dddf)...
Attempting to revert the folder: version_control_db to version: 43b7dddfc0d653d593b942010257cb9795e841b0
Reverted to commit 43b7dddfc0d653d593b942010257cb9795e841b0
Successfully reverted the folder: version_control_db to version: 43b7dddfc0d653d593b942010257cb9795e841b0

6. Verifying reverted data:

Users:


| | name | age | city |
|---|---|---|---|
| user1 | Alice | 25 | New York |
| user2 | Bob | 30 | London |
| user3 | Charlie | 35 | Paris |



Orders:


| | product | price | quantity |
|---|---|---|---|
| order1 | Laptop | 1000 | 1 |
| order2 | Phone | 500 | 2 |
| order3 | Tablet | 300 | 3 |
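
Selecting the commit with `[-1]` relies on the listing being ordered newest first. A possible alternative is to look the commit up by its message. The sketch below uses a plain dictionary shaped like the version listing shown earlier, with the short hashes as printed there; `find_version` is a hypothetical helper, not a FolderDB method.

```python
def find_version(versions, text):
    """Return the single hash whose commit message contains `text`."""
    matches = [h for h, message in versions.items() if text in message]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one match for {text!r}, found {len(matches)}"
        )
    return matches[0]


# Stand-in for the mapping returned by db.version(), keyed by the short hashes printed earlier
versions = {
    "264bea4f": "Manual Commit: 2025-03-26@17-21 Added new order and updated user information",
    "43b7dddf": "Manual Commit: 2025-03-26@17-20 Initial commit with users and orders data",
}

print(find_version(versions, "Initial commit"))
```

Raising on zero or multiple matches avoids silently reverting to the wrong commit when two messages share the same wording.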


## Cleanup

Finally, let's clean up by removing the database folder and its contents.

In [8]:
import shutil
import stat

def _force_remove(func, path, _exc_info):
    os.chmod(path, stat.S_IWRITE)  # .git object files are read-only on Windows
    func(path)

print("7. Cleaning up...")
shutil.rmtree(db_folder, onerror=_force_remove)  # removes nested dirs such as .git
print("   - Removed temporary database")

7. Cleaning up...
   - Removed temporary database