# Visualizing Overlapping Datetime Data

This notebook demonstrates how to visualize multiple JSONL files whose datetime linekeys overlap, using the `visualize_jsonl` and `visualize_folderdb` functions.

## Setup and Imports

First, let's import the required libraries and set up our environment.

In [1]:
import os
import sys
import pandas as pd
from datetime import datetime, timedelta
import bokeh as bk
import bokeh.io  # load the io submodule explicitly so that bk.io.* resolves


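# Add the directory three levels above the current working directory to the
# Python path so that the visual and folderdb modules can be imported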
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.getcwd()))))

from visual import visualize_jsonl, visualize_folderdb
from folderdb import FolderDB

In [2]:
bk.io.output_notebook()
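
The three nested `os.path.dirname` calls in the setup cell climb three directory levels from the working directory. As a small sketch of the same idea with `pathlib`, the snippet below uses a hypothetical notebook location (`/home/user/project/docs/examples/notebooks` is an assumption made for illustration, not the real layout of this project) and checks that `parents[2]` gives the same result.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical notebook directory, used only to illustrate the path arithmetic.
notebook_dir = PurePosixPath("/home/user/project/docs/examples/notebooks")

# parents[0] is one level up, so parents[2] is three levels up.
project_root = notebook_dir.parents[2]

# The equivalent nested-dirname form, as used in the setup cell.
nested = posixpath.dirname(posixpath.dirname(posixpath.dirname(str(notebook_dir))))

print(project_root)
print(nested == str(project_root))
```

In a real notebook, `Path.cwd().parents[2]` would play the role of `project_root`, provided the working directory is at least three levels deep.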

## Create Sample Data with Overlapping Datetimes

Let's create a sample FolderDB with multiple JSONL files that have overlapping datetime linekeys.

In [3]:
# Create a temporary directory for our database
db_folder = "overlapping_datetime_db"
os.makedirs(db_folder, exist_ok=True)

# Initialize the database
db = FolderDB(db_folder)

# Create the base datetime range: 24 hourly timestamps starting from now
base_dates = [datetime.now() + timedelta(hours=i) for i in range(24)]

# Create three datasets with overlapping dates
# Dataset 1: Full day with temperature data
temp_data = pd.DataFrame({
    'temperature': [20 + i * 0.5 for i in range(24)],
    'location': ['Room A'] * 24
}, index=base_dates)

# Dataset 2: Humidity data on the same 24 timestamps (full overlap with temperature)
humidity_data = pd.DataFrame({
    'humidity': [40 + i * 2 for i in range(24)],
    'location': ['Room A'] * 24
}, index=base_dates)

# Dataset 3: Pressure data covering only the last 12 hours of the range (partial overlap)
pressure_dates = [d + timedelta(hours=12) for d in base_dates[:12]]
pressure_data = pd.DataFrame({
    'pressure': [1013 + i * 0.1 for i in range(12)],
    'location': ['Room A'] * 12
}, index=pressure_dates)

# Save the data
db.upsert_df("temperature", temp_data)
db.upsert_df("humidity", humidity_data)
db.upsert_df("pressure", pressure_data)

print("Created sample data:")
print(db)

Created sample data:
FolderDB at overlapping_datetime_db
--------------------------------------------------
humidity.jsonl:
  Size: 17280 bytes
  Count: 288
  Key range: 2025-03-26T17:51:14 to 2025-03-27T17:08:23
  Linted: False
pressure.jsonl:
  Size: 9216 bytes
  Count: 144
  Key range: 2025-03-27T05:51:14 to 2025-03-27T17:08:23
  Linted: False
temperature.jsonl:
  Size: 18720 bytes
  Count: 288
  Key range: 2025-03-26T17:51:14 to 2025-03-27T17:08:23
  Linted: False
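
To make "overlapping" concrete, the following sketch computes the shared time window between two key ranges using only the standard library. It rebuilds the same hourly timestamps as the cell above, but from a fixed start time (an assumption chosen so the result is reproducible), and the helper names `key_range` and `overlap` are hypothetical, not part of `folderdb`.

```python
from datetime import datetime, timedelta

# Fixed start time (an assumption; the notebook itself uses datetime.now()).
start = datetime(2025, 3, 26, 18, 0, 0)
base_dates = [start + timedelta(hours=i) for i in range(24)]
pressure_dates = [d + timedelta(hours=12) for d in base_dates[:12]]


def key_range(keys):
    """Return the (earliest, latest) key of a collection of datetimes."""
    return min(keys), max(keys)


def overlap(range_a, range_b):
    """Return the shared (start, end) window of two ranges, or None if disjoint."""
    lo = max(range_a[0], range_b[0])
    hi = min(range_a[1], range_b[1])
    return (lo, hi) if lo <= hi else None


temp_range = key_range(base_dates)
pressure_range = key_range(pressure_dates)
shared = overlap(temp_range, pressure_range)

print("temperature:", temp_range[0].isoformat(), "to", temp_range[1].isoformat())
print("pressure:   ", pressure_range[0].isoformat(), "to", pressure_range[1].isoformat())
print("shared:     ", shared[0].isoformat(), "to", shared[1].isoformat())
```

With these inputs the pressure range lies entirely inside the second half of the temperature range, so the shared window equals the pressure range.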


## Visualize Individual Files

Let's visualize each file separately to see its datetime distribution.

In [4]:
# Visualize temperature data
temp_plot = visualize_jsonl(os.path.join(db_folder, "temperature.jsonl"))
temp_plot.title.text = "Temperature Data Distribution"
bk.io.show(temp_plot)

In [5]:
# Visualize humidity data
humidity_plot = visualize_jsonl(os.path.join(db_folder, "humidity.jsonl"))
humidity_plot.title.text = "Humidity Data Distribution"
bk.io.show(humidity_plot)

In [6]:
# Visualize pressure data
pressure_plot = visualize_jsonl(os.path.join(db_folder, "pressure.jsonl"))
pressure_plot.title.text = "Pressure Data Distribution"
bk.io.show(pressure_plot)
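
For a quick numeric check of a file's datetime distribution without plotting, the keys can be binned by hour. The sketch below is not how `visualize_jsonl` works internally; it assumes that each JSONL line is a JSON object whose top-level keys are ISO 8601 datetime strings (an assumption about the on-disk format), and `hourly_counts` is a hypothetical helper. It writes its own small sample file so that it runs on its own.

```python
import json
import os
import tempfile
from collections import Counter
from datetime import datetime, timedelta


def hourly_counts(path):
    """Count records per hour, assuming each line is {"<ISO datetime>": {...}}."""
    counts = Counter()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            for key in record:
                ts = datetime.fromisoformat(key)
                # Truncate to the start of the hour to form the bin.
                counts[ts.replace(minute=0, second=0, microsecond=0)] += 1
    return counts


# Build a small sample file: six records spaced 20 minutes apart.
start = datetime(2025, 3, 26, 18, 0, 0)
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.jsonl")
    with open(sample_path, "w", encoding="utf-8") as fh:
        for i in range(6):
            key = (start + timedelta(minutes=20 * i)).isoformat()
            fh.write(json.dumps({key: {"temperature": 20 + i * 0.5}}) + "\n")
    counts = hourly_counts(sample_path)

for hour, n in sorted(counts.items()):
    print(hour.isoformat(), n)
```

If the real files use a different layout, only the key-extraction loop inside `hourly_counts` would need to change.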

## Visualize All Data Together

Now let's visualize all the data together to see how the overlapping datetime linekeys are handled.

In [7]:
# Visualize the entire database
db_plot = visualize_folderdb(db_folder)
db_plot.title.text = "Overlapping Datetime Data Distribution"
bk.io.show(db_plot)

Found 3 JSONL files in overlapping_datetime_db
Processing humidity.jsonl with 288 entries
Added 288 points for humidity.jsonl
Processing pressure.jsonl with 144 entries
Added 144 points for pressure.jsonl
Processing temperature.jsonl with 288 entries
Added 288 points for temperature.jsonl
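
When reading the combined plot, it can help to know which timestamps are covered by which files. The sketch below builds that coverage map with the standard library; it mirrors the three datasets created earlier but uses a fixed start time (an assumption for reproducibility) and plain lists of datetimes rather than reading from the database.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Fixed start time (an assumption; the notebook itself uses datetime.now()).
start = datetime(2025, 3, 26, 18, 0, 0)
base_dates = [start + timedelta(hours=i) for i in range(24)]

# Same shape as the sample data: two full-day series and one half-day series.
datasets = {
    "temperature": base_dates,
    "humidity": base_dates,
    "pressure": base_dates[12:],
}

# Map each timestamp to the set of dataset names that contain it.
coverage = defaultdict(set)
for name, keys in datasets.items():
    for key in keys:
        coverage[key].add(name)

in_all_three = sorted(k for k, names in coverage.items() if len(names) == 3)
in_two_only = sorted(k for k, names in coverage.items() if len(names) == 2)

print("timestamps present in all three files:", len(in_all_three))
print("timestamps present in exactly two files:", len(in_two_only))
```

Timestamps that appear in several files are the ones where points from different series share the same position on the time axis.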


## Cleanup

Finally, let's clean up our temporary database.

In [8]:
# Remove the temporary database
for file in os.listdir(db_folder):
    os.remove(os.path.join(db_folder, file))
os.rmdir(db_folder)
print("Cleaned up temporary database")

Cleaned up temporary database
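
As an alternative to manual cleanup, the database folder could live inside a `tempfile.TemporaryDirectory`, which removes the folder and everything in it when the `with` block exits, even if an earlier step raises. This is a sketch of that pattern only: a placeholder file stands in for the database files, and whether `FolderDB` works unchanged with such a path is an assumption that has not been verified here.

```python
import os
import tempfile

# The directory is created on entry and removed, with its contents, on exit.
with tempfile.TemporaryDirectory(prefix="overlapping_datetime_db_") as db_folder:
    # Placeholder file standing in for the JSONL files a database would write.
    sample_path = os.path.join(db_folder, "temperature.jsonl")
    with open(sample_path, "w", encoding="utf-8") as fh:
        fh.write("{}\n")
    existed_inside = os.path.exists(sample_path)

removed_after = not os.path.exists(db_folder)
print("file existed inside the block:", existed_inside)
print("folder removed after the block:", removed_after)
```

In a notebook this requires keeping the database work inside a single cell, since the folder disappears as soon as the `with` block ends.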
