# Migrate OMOP v5.4 Vocabulary and Concept Tables to DuckDB

## Overview

This notebook demonstrates how to migrate the OMOP vocabulary and concept tables from a [Broadsea](https://github.com/OHDSI/Broadsea) PostgreSQL database loaded with the OMOP vocabularies to a local DuckDB database.

### Import Libraries

We use Ibis as the primary library to connect to both PostgreSQL and DuckDB. The PostgreSQL connection settings are stored in a TOML file (`config.toml`) and read with the standard-library `tomllib` module.

In [None]:
import tomllib

import ibis

# Load the PostgreSQL connection settings from the TOML file
with open("config.toml", "rb") as file:
    config = tomllib.load(file)

POSTGRES_USER = config['broadsea_postgresql']['POSTGRES_USER']
POSTGRES_PASSWORD = config['broadsea_postgresql']['POSTGRES_PASSWORD']
POSTGRES_DB = config['broadsea_postgresql']['POSTGRES_DB']
POSTGRES_SCHEMA = config['broadsea_postgresql']['POSTGRES_SCHEMA']
POSTGRES_HOST = config['broadsea_postgresql']['POSTGRES_HOST']
POSTGRES_PORT = config['broadsea_postgresql']['POSTGRES_PORT']


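# Connect to the Broadsea PostgreSQL database
con = ibis.postgres.connect(
    schema=POSTGRES_SCHEMA,
    database=POSTGRES_DB,
    host=POSTGRES_HOST,
    port=POSTGRES_PORT,
    user=POSTGRES_USER,
    password=POSTGRES_PASSWORD,
)

# List the tables visible through this connection
con.list_tables()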


### Copy Tables to DuckDB

Next, we create the corresponding tables in DuckDB and copy the data from the Broadsea PostgreSQL database into a local `omop54.ddb` file. Each table is pulled into memory as a PyArrow table before being written to DuckDB.

In [None]:
import polars as pl
from tqdm import tqdm

def create_duckdb_tables(table_names, con):
    """Copy each named table from the source connection into omop54.ddb."""
    # Open the DuckDB connection once instead of once per table
    duckdb_con = ibis.connect("duckdb://omop54.ddb")

    for table_name in tqdm(table_names, desc='Processing tables'):
        # Reference the source table on the PostgreSQL connection
        t = con.table(table_name)

        # Materialize the table as PyArrow and write it to DuckDB
        duckdb_con.create_table(table_name, t.to_pyarrow(), overwrite=True)

# List of table names
table_names = [
    "concept",
    "concept_ancestor",
    "concept_class",
    "concept_recommended",
    "concept_relationship",
    "concept_synonym",
    "vocabulary"
]

# Create DuckDB tables
create_duckdb_tables(table_names, con)

In [None]:
voc_db = ibis.connect("duckdb://omop54.ddb")

voc_db.list_tables()
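
Listing the tables confirms that they exist, but not that every row arrived. One way to check this, sketched below, is to compare row counts between the source and target connections. The `row_count_mismatches` helper is a hypothetical name, not part of Ibis. It assumes only that both connections expose `.table(name).count().execute()`, as Ibis backends do.

```python
def row_count_mismatches(source_con, target_con, table_names):
    """Return {table: (source_rows, target_rows)} for tables whose row counts differ."""
    mismatches = {}
    for name in table_names:
        # Each count is computed by the backend that owns the table
        source_rows = int(source_con.table(name).count().execute())
        target_rows = int(target_con.table(name).count().execute())
        if source_rows != target_rows:
            mismatches[name] = (source_rows, target_rows)
    return mismatches
```

In this notebook the call would be `row_count_mismatches(con, voc_db, table_names)`. An empty dictionary means every table was copied with a matching row count.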

In [None]:
concepts_tbl = voc_db.table("concept")

# Preview the first five concepts as a Polars DataFrame (pl was imported in an earlier cell)
concepts_df = pl.from_arrow(concepts_tbl.head(5).to_pyarrow())
concepts_df