Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database files: sqlite, dbf etc? #27

Open
jaakla opened this issue Dec 24, 2021 · 5 comments
Open

Database files: sqlite, dbf etc? #27

jaakla opened this issue Dec 24, 2021 · 5 comments

Comments

@jaakla
Copy link

jaakla commented Dec 24, 2021

out of scope?

@wader
Copy link
Owner

wader commented Dec 24, 2021

Hi, it depends a bit what the usage is. The main usage of fq it to provide detailed access to a binary format so formats that have "user facing" (rows and columns etc) format that is different from how it's actually encoded (indexes and various compact value encodings etc) might decode to something a user would not expect. Have a look at #25 where i describe issue and possible solutions to it.

BTW have you looked at https://github.com/neilotoole/sq? is that more close to want your looking for?

@jaakla
Copy link
Author

jaakla commented Dec 26, 2021

I mean access of raw structures. Yep it is more close to real data for simpler/older databases like dbf (my common problem: what is charset?) and can be more tricky for eg sqlite or leveldb.
For advanced bigdata ones like heavily compressed ORC etc it probably really does not make much sense.

@wader
Copy link
Owner

wader commented Dec 26, 2021

Ok! then a fq decoder could be a good fit i think. Is it something you would be interested to work on?

I wonder how hard it would be to have some per format plumbing in jq that traverses the raw structure and output rows etc. Actually jq:s standard library has some SQL-like support already :)

wader added a commit that referenced this issue Dec 28, 2021
See sqlite3.{go,jq} for TODO

Related to #27
@wader
Copy link
Owner

wader commented Dec 28, 2021

Hey, got a bit curious how btree traversing in jq could work so had to just try. Very much work in progress and does not support page overflow yet so probably only works with tables with small rows. Feel free to work on it if you want or if you have any data modeling and naming suggestions.

With this you can do this:

$ cat format/sqlite3/testdata/test.sql
CREATE TABLE aaa (
    cint int primary key,
    cvarchar varchar(30),
    ctext text,
    creal real,
    cblob blob
);
INSERT INTO "aaa" VALUES(0, 'var1', 'text1', 0, "blob1");
INSERT INTO "aaa" VALUES(1, 'var2', 'test2', 1, "blob2");
INSERT INTO "aaa" VALUES(128, 'var3', 'test3', 128, "blob3");
INSERT INTO "aaa" VALUES(-128, 'var3', 'test3', -128, "blob3");
INSERT INTO "aaa" VALUES(9223372036854775807, 'var4', 'test4', 9223372036854775807, "blob4");
INSERT INTO "aaa" VALUES(-9223372036854775808, 'var5', 'test5', -9223372036854775808, "blob5");

$ go run fq.go -d sqlite3 '.header, .pages[0] | d' format/sqlite3/testdata/test.db
    │00 01 02 03 04 05 06 07 08 09 0a 0b│0123456789ab│.header{}:
0x00│53 51 4c 69 74 65 20 66 6f 72 6d 61│SQLite forma│  magic: "SQLite format 3\x00" (valid)
0x0c│74 20 33 00                        │t 3.        │
0x0c│            10 00                  │    ..      │  page_size: 4096
0x0c│                  01               │      .     │  write_version: "legacy" (1)
0x0c│                     01            │       .    │  read_version: "legacy" (1)
0x0c│                        00         │        .   │  unused_space: 0
0x0c│                           40      │         @  │  maximum_embedded_payload_fraction: 64
0x0c│                              20   │            │  minimum_embedded_payload_fraction: 32
0x0c│                                 20│            │  leaf_payload_fraction: 32
0x18│00 00 00 07                        │....        │  file_change_counter: 7
0x18│            00 00 00 03            │    ....    │  database_size_pages: 3
0x18│                        00 00 00 00│        ....│  page_number_freelist: 0
0x24│00 00 00 00                        │....        │  total_number_freelist: 0
0x24│            00 00 00 01            │    ....    │  schema_cookie: 1
0x24│                        00 00 00 04│        ....│  schema_format_number: 4
0x30│00 00 00 00                        │....        │  default_page_cache_size: 0
0x30│            00 00 00 00            │    ....    │  page_number_largest_root_btree: 0
0x30│                        00 00 00 01│        ....│  text_encoding: "utf8" (1)
0x3c│00 00 00 00                        │....        │  user_version: 0
0x3c│            00 00 00 00            │    ....    │  incremental_vacuum_mode: 0
0x3c│                        00 00 00 00│        ....│  application_id: 0
0x48│00 00 00 00 00 00 00 00 00 00 00 00│............│  reserved: raw bits (all zero)
0x54│00 00 00 00 00 00 00 00            │........    │
0x54│                        00 00 00 07│        ....│  version_valid_for: 7
0x60│00 2e 53 60                        │..S`        │  sqlite_version_number: 3036000
     │00 01 02 03 04 05 06 07 08 09 0a 0b│0123456789ab│.pages[0]{}:
0x060│            0d                     │    .       │  type: "table_leaf" (13) (Table leaf b-tree page)
0x060│               0f f8               │     ..     │  start_freeblocks: 4088
0x060│                     00 02         │       ..   │  page_cells: 2
0x060│                           0f 44   │         .D │  cell_start: 3908
0x060│                                 00│           .│  cell_fragments: 0
     │                                   │            │  cells_pointers[0:2]:
0x06c│0f 44                              │.D          │    [0]: 3908
0x06c│      0f d1                        │  ..        │    [1]: 4049
     │                                   │            │  cells[0:2]:
     │                                   │            │    [0]{}:
0xf3c│                        81 0a      │        ..  │      payload_len: 138
0xf3c│                              01   │          . │      rowid: 1
     │                                   │            │      payload{}:
0xf3c│                                 07│           .│        length: 7
     │                                   │            │        serials[0:5]:
0xf48│17                                 │.           │          [0]: 23
0xf48│   13                              │ .          │          [1]: 19
0xf48│      13                           │  .         │          [2]: 19
0xf48│         01                        │   .        │          [3]: 1
0xf48│            81 7b                  │    .{      │          [4]: 251
     │                                   │            │        contents[0:5]:
0xf48│                  74 61 62 6c 65   │      table │          [0]: "table" (text)
0xf48│                                 61│           a│          [1]: "aaa" (text)
0xf54│61 61                              │aa          │
0xf54│      61 61 61                     │  aaa       │          [2]: "aaa" (text)
0xf54│               02                  │     .      │          [3]: 2 (8-bit integer)
0xf54│                  43 52 45 41 54 45│      CREATE│          [4]: "CREATE TABLE aaa (\n    cint int primary key,\n    c"... (text)
0xf60│20 54 41 42 4c 45 20 61 61 61 20 28│ TABLE aaa (│
*    │until 0xfd0.7 (119)                │            │
     │                                   │            │    [1]{}:
0xfcc│               25                  │     %      │      payload_len: 37
0xfcc│                  02               │      .     │      rowid: 2
     │                                   │            │      payload{}:
0xfcc│                     06            │       .    │        length: 6
     │                                   │            │        serials[0:5]:
0xfcc│                        17         │        .   │          [0]: 23
0xfcc│                           39      │         9  │          [1]: 57
0xfcc│                              13   │          . │          [2]: 19
0xfcc│                                 01│           .│          [3]: 1
0xfd8│00                                 │.           │          [4]: 0
     │                                   │            │        contents[0:5]:
0xfd8│   69 6e 64 65 78                  │ index      │          [0]: "index" (text)
0xfd8│                  73 71 6c 69 74 65│      sqlite│          [1]: "sqlite_autoindex_aaa_1" (text)
0xfe4│5f 61 75 74 6f 69 6e 64 65 78 5f 61│_autoindex_a│
0xff0│61 61 5f 31                        │aa_1        │
0xff0│            61 61 61               │    aaa     │          [2]: "aaa" (text)
0xff0│                     03            │       .    │          [3]: 3 (8-bit integer)
     │                                   │            │          [4]: "NULL" (null)
     │                                   │            │  freeblocks[0:1]:
     │                                   │            │    [0]{}:
0xff0│                        00 00      │        ..  │      next_offset: 0

$ go run fq.go -d sqlite3 'sqlite3_table("aaa") | ., tovalue' format/sqlite3/testdata/test.db
      │00 01 02 03 04 05 06 07 08 09 0a 0b│0123456789ab│.pages[1].cells[0].payload.contents[0:5]:
0x1f5c│      80 00 00 00 00 00 00 00      │  ........  │  [0]: -9223372036854775808 (64-bit integer)
0x1f5c│                              76 61│          va│  [1]: "var5" (text)
0x1f68│72 35                              │r5          │
0x1f68│      74 65 73 74 35               │  test5     │  [2]: "test5" (text)
0x1f68│                     c3 e0 00 00 00│       .....│  [3]: -9.223372036854776e+18 (64-bit float)
0x1f74│00 00 00                           │...         │
0x1f74│         62 6c 6f 62 35            │   blob5    │  [4]: "blob5" (text)
[
  -9223372036854775808,
  "var5",
  "test5",
  -9223372036854776000,
  "blob5"
]
      │00 01 02 03 04 05 06 07 08 09 0a 0b│0123456789ab│.pages[1].cells[1].payload.contents[0:5]:
0x1f80│            7f ff ff ff ff ff ff ff│    ........│  [0]: 9223372036854775807 (64-bit integer)
0x1f8c│76 61 72 34                        │var4        │  [1]: "var4" (text)
0x1f8c│            74 65 73 74 34         │    test4   │  [2]: "test4" (text)
0x1f8c│                           43 e0 00│         C..│  [3]: 9.223372036854776e+18 (64-bit float)
0x1f98│00 00 00 00 00                     │.....       │
0x1f98│               62 6c 6f 62 34      │     blob4  │  [4]: "blob4" (text)
[
  9223372036854775807,
  "var4",
  "test4",
  9223372036854776000,
  "blob4"
]
      │00 01 02 03 04 05 06 07 08 09 0a 0b│0123456789ab│.pages[1].cells[2].payload.contents[0:5]:
0x1fa4│                  80               │      .     │  [0]: -128 (8-bit integer)
0x1fa4│                     76 61 72 33   │       var3 │  [1]: "var3" (text)
0x1fa4│                                 74│           t│  [2]: "test3" (text)
0x1fb0│65 73 74 33                        │est3        │
0x1fb0│            80                     │    .       │  [3]: -128 (8-bit integer)
0x1fb0│               62 6c 6f 62 33      │     blob3  │  [4]: "blob3" (text)
[
  -128,
  "var3",
  "test3",
  -128,
  "blob3"
]
      │00 01 02 03 04 05 06 07 08 09 0a 0b│0123456789ab│.pages[1].cells[3].payload.contents[0:5]:
0x1fbc│                  00 80            │      ..    │  [0]: 128 (16-bit integer)
0x1fbc│                        76 61 72 33│        var3│  [1]: "var3" (text)
0x1fc8│74 65 73 74 33                     │test3       │  [2]: "test3" (text)
0x1fc8│               00 80               │     ..     │  [3]: 128 (16-bit integer)
0x1fc8│                     62 6c 6f 62 33│       blob3│  [4]: "blob3" (text)
[
  128,
  "var3",
  "test3",
  128,
  "blob3"
]
      │00 01 02 03 04 05 06 07 08 09 0a 0b│0123456789ab│.pages[1].cells[4].payload.contents[0:5]:
      │                                   │            │  [0]: 1 (constant 1)
0x1fd4│                        76 61 72 32│        var2│  [1]: "var2" (text)
0x1fe0│74 65 73 74 32                     │test2       │  [2]: "test2" (text)
      │                                   │            │  [3]: 1 (constant 1)
0x1fe0│               62 6c 6f 62 32      │     blob2  │  [4]: "blob2" (text)
[
  1,
  "var2",
  "test2",
  1,
  "blob2"
]
      │00 01 02 03 04 05 06 07 08 09 0a 0b│0123456789ab│.pages[1].cells[5].payload.contents[0:5]:
      │                                   │            │  [0]: 0 (constant 0)
0x1fec│                  76 61 72 31      │      var1  │  [1]: "var1" (text)
0x1fec│                              74 65│          te│  [2]: "text1" (text)
0x1ff8│78 74 31                           │xt1         │
      │                                   │            │  [3]: 0 (constant 0)
0x1ff8│         62 6c 6f 62 31            │   blob1    │  [4]: "blob1" (text)
[
  0,
  "var1",
  "text1",
  0,
  "blob1"
]

wader added a commit that referenced this issue Dec 28, 2021
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue Jan 12, 2022
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue Jan 20, 2022
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue Jan 20, 2022
See sqlite3.{go,jq} for TODO

Related to #27
@wader
Copy link
Owner

wader commented Jan 22, 2022

Made some progress but still needs work. This works now, i call it "fqlite":

$ go run fq.go --arg name "R.E.M." 'torepr as $db | first($db.artists[] | select(.[1]==$name)) as $artist | $db.albums[] | select(.[2] == $artist[0]) | .[1] | tovalue' format/sqlite3/testdata/chinook.db
"Green"
"New Adventures In Hi-Fi"
"The Best Of R.E.M.: The IRS Years"

wader added a commit that referenced this issue Feb 14, 2022
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue Mar 9, 2022
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue May 3, 2022
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue May 26, 2022
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue Jun 30, 2022
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue Jul 25, 2022
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue Aug 18, 2022
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue Dec 24, 2022
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue Apr 13, 2023
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue May 1, 2023
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue May 17, 2023
See sqlite3.{go,jq} for TODO

Related to #27
wader added a commit that referenced this issue Apr 30, 2024
See sqlite3.{go,jq} for TODO

Related to #27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants