Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accessing sqlite database stored in blob #50

Open
yairlenga opened this issue Sep 11, 2022 · 1 comment
Open

Accessing sqlite database stored in blob #50

yairlenga opened this issue Sep 11, 2022 · 1 comment
Assignees

Comments

@yairlenga
Copy link

yairlenga commented Sep 11, 2022

Hi. I might be pushing the boundaries a little bit - I've got interesting use case, which I might be able to solve with a little bit more functionality from sqlite_fdw:

Short Version: Is it possible to extend the sqlite_fdw so that it will work on databases stored inside BLOBs (which can be updated/retrieved via PG client code) ?

Long Version:

In my use case, I have a large number of sqlite data sets (>100K), each associated with a single financial instrument. The individual data sets are merged into X-large master repository (Postgresql). For the purpose of financial simulations, new SQLite data sets are created from the master repository, and are sent to simulations in distributed environment. This is time consuming process. Using SQLite is essential because of the complexity and relationship inside the data.

My question: Is it possible to extend the sqlite FDW so that it will work on BLOBS stored in Postgresql database. This will make it easier to transfer the input sqlite databases into Postgresql - extract the data with fast stored proc. Likewise, the creation of the sqlite data sets for the final analysis will be much faster - it will be processed inside Postgresql, and retrieved as a blob.

Extending the sqlite FDW to allow reading from sqlite blobs/creating sqlite blobs will effectively allow for creating "embedded" databases inside Postgresql, with the advantage that unlimited number of small embedded databases will be supported. I believe that there will be many other application for this functionality - aggregating "mini-databases" into single Postgresql "warehouse", and creating sqlite data sets for processing outside postgresql - in compute grids, etc.

While solving for the generic case (allowing concurrent access to many databases) - I believe that practical solution, where there is a cap on how many concurrent sqlite databases may be opened by a single client connection to is OK. More specific - for my use case - capping the number of "open" sqlite databases to 2 - one for read, one for write. Also, limit of the sqlite database size (e.g. < 10MB) will work for my application.

Curious to hear how much work is involved (I believe technically, this is possible, as sqlite can support both in memory databases, on disk databases, and vfs databases). Also, are there other application for this kind of functionality ?

@t-kataym
Copy link
Contributor

(I might be misunderstanding your comment, please correct me if something is wrong.)

If you create a table with BLOB column on SQLite, you can access it from PostgreSQL by bytea column.

$ sqlite3 /tmp/fdw.db
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table blob_tbl (a INTEGER, b BLOB);
sqlite> insert into blob_tbl values(100, x'0500');
$ ./psql test
psql (14.5)
Type "help" for help.

test=# create extension sqlite_fdw;
CREATE EXTENSION
test=# CREATE SERVER test FOREIGN DATA WRAPPER sqlite_fdw OPTIONS (database '/tmp/fdw.db');
CREATE SERVER
test=# IMPORT FOREIGN SCHEMA public FROM SERVER test INTO public;
IMPORT FOREIGN SCHEMA
test=# \det
   List of foreign tables
 Schema |  Table   | Server 
--------+----------+--------
 public | blob_tbl | test
(1 row)

test=# select * from blob_tbl;
  a  |   b    
-----+--------
 100 | \x0500
(1 row)

test=# \d blob_tbl 
                Foreign table "public.blob_tbl"
 Column |  Type  | Collation | Nullable | Default | FDW options 
--------+--------+-----------+----------+---------+-------------
 a      | bigint |           |          |         | 
 b      | bytea  |           |          |         | 
Server: test
FDW options: ("table" 'blob_tbl')

If this behavior does not fit your expectation, could you explain what you expect to do for sqlite_fdw(and PostgreSQL)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants