Schemaless data support #1046

Merged
merged 36 commits into from
Dec 25, 2022
Conversation

panarch
Member

@panarch panarch commented Dec 12, 2022

Convert Row to enum with Vec and Map variants.
Make column_defs in Schema optional.
For example:

CREATE TABLE Foo;
INSERT INTO Foo VALUES ('{ "a": "hello", "b": true }'), ('{ "a": 100 }');
SELECT a FROM Foo;
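
The two changes described above can be pictured with a minimal sketch. It is not the actual GlueSQL definition: the `Value` enum is trimmed to three variants, column definitions are reduced to plain names, and `Row::get` and `sample_rows` are hypothetical helpers added only for illustration.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the value type (assumption: only three variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bool(bool),
    I64(i64),
    Str(String),
}

/// A row is either positional (table with a schema) or keyed (schemaless table).
#[derive(Debug, Clone, PartialEq)]
pub enum Row {
    Vec { columns: Vec<String>, values: Vec<Value> },
    Map(HashMap<String, Value>),
}

/// `column_defs` is `None` for a table created without columns,
/// as in `CREATE TABLE Foo;`. Column definitions are reduced to names here.
#[derive(Debug, Clone)]
pub struct Schema {
    pub table_name: String,
    pub column_defs: Option<Vec<String>>,
}

impl Row {
    /// Hypothetical helper: look up a value by column name in either variant.
    pub fn get(&self, name: &str) -> Option<&Value> {
        match self {
            Row::Vec { columns, values } => columns
                .iter()
                .position(|column| column == name)
                .and_then(|index| values.get(index)),
            Row::Map(map) => map.get(name),
        }
    }
}

/// Builds the two rows from the example INSERT by hand (no JSON parsing).
pub fn sample_rows() -> Vec<Row> {
    let first = HashMap::from([
        ("a".to_owned(), Value::Str("hello".to_owned())),
        ("b".to_owned(), Value::Bool(true)),
    ]);
    let second = HashMap::from([("a".to_owned(), Value::I64(100))]);
    vec![Row::Map(first), Row::Map(second)]
}
```

In this sketch, `SELECT a FROM Foo` corresponds to calling `get("a")` on each row: the first row yields a string and the second an integer, which is the point of dropping the fixed column list.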

@panarch panarch added the enhancement New feature or request label Dec 12, 2022
@panarch panarch self-assigned this Dec 12, 2022
@coveralls

coveralls commented Dec 12, 2022

Pull Request Test Coverage Report for Build 3775416308

  • 974 of 1017 (95.77%) changed or added relevant lines in 39 files are covered.
  • 28 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.1%) to 98.522%

| Changes Missing Coverage                  | Covered Lines | Changed/Added Lines | %      |
|-------------------------------------------|---------------|---------------------|--------|
| core/src/executor/validate.rs             | 4             | 5                   | 80.0%  |
| core/src/plan/validate.rs                 | 3             | 4                   | 75.0%  |
| core/src/result.rs                        | 0             | 1                   | 0.0%   |
| core/src/data/row.rs                      | 21            | 23                  | 91.3%  |
| core/src/executor/evaluate/evaluated.rs   | 6             | 8                   | 75.0%  |
| core/src/executor/context/row_context.rs  | 8             | 11                  | 72.73% |
| core/src/executor/insert.rs               | 112           | 115                 | 97.39% |
| core/src/executor/fetch.rs                | 90            | 94                  | 95.74% |
| core/src/executor/evaluate/stateless.rs   | 27            | 32                  | 84.38% |
| core/src/store/data_row.rs                | 12            | 18                  | 66.67% |

| Files with Coverage Reduction       | New Missed Lines | %      |
|-------------------------------------|------------------|--------|
| cli/src/print.rs                    | 1                | 95.55% |
| core/src/executor/insert.rs         | 1                | 96.3%  |
| core/src/executor/aggregate/mod.rs  | 26               | 86.39% |

| Totals                              |       |
|-------------------------------------|-------|
| Change from base Build 3756284685   | -0.1% |
| Covered Lines                       | 37455 |
| Relevant Lines                      | 38017 |

💛 - Coveralls

If columns are not provided, the output is now printed without empty brackets.
e.g.
CREATE TABLE Foo; -- columns: vec![]
Also add sample test cases to the test-suite.
The query planner does not work with schemaless data yet; this will be addressed in follow-up work.
@panarch panarch marked this pull request as ready for review December 21, 2022 06:43
@panarch panarch changed the title WIP. Schemaless data support Schemaless data support Dec 21, 2022
@panarch panarch requested a review from ever0de December 24, 2022 08:12
"SELECT
Player.id AS player_id,
Player.name AS player_name,
Item.obj['cost'] AS item_cost
Collaborator

@devgony devgony Dec 25, 2022

It looks like we cannot unnest a nested Map when the same key holds a Map in some rows and a value of another type in other rows:

gluesql> CREATE TABLE Items;
Table created

gluesql> INSERT INTO Items VALUES ('{"a": {"x": 1}}');
1 row inserted

gluesql> INSERT INTO Items VALUES ('{"a": 1}');
1 row inserted

gluesql> SELECT * FROM Items;
| a     |
|-------|
| [MAP] |
| 1     |

gluesql> SELECT a['x'] FROM Items limit 1;
| a['x'] |
|--------|
| 1      |

gluesql> SELECT a['x'] FROM Items;
[error] selector requires MAP or LIST types

We may need some function like get_type(columnName)?

SELECT
CASE
  WHEN get_type(a) = 'MAP' THEN a['x']
  ELSE a
END
FROM Items;

Member Author

A new function that retrieves the type of a value sounds nice to have.
How about naming it TYPEOF?

SELECT 
    CASE 
        WHEN TYPEOF(a) = 'MAP' THEN a['x']
        ELSE a
    END
FROM Items;
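
To make the proposal concrete, here is a rough Rust sketch of the behaviour being discussed. It assumes a trimmed `Value` enum with only two variants; `type_of`, `select_key`, and `guarded_select` are hypothetical names, and the `"INT"` label is illustrative rather than what a real TYPEOF would return. How a missing key should behave is also an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Trimmed stand-in for the value type (assumption: only two variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    I64(i64),
    Map(HashMap<String, Value>),
}

/// Hypothetical equivalent of the proposed TYPEOF(value).
pub fn type_of(value: &Value) -> &'static str {
    match value {
        Value::I64(_) => "INT",
        Value::Map(_) => "MAP",
    }
}

/// Mirrors the selector `a['x']`: it fails on anything that is not a map,
/// which is why the unguarded query errors on the row `{"a": 1}`.
pub fn select_key(value: &Value, key: &str) -> Result<Value, String> {
    match value {
        Value::Map(map) => map
            .get(key)
            .cloned()
            // Assumption: a missing key is treated as an error in this sketch.
            .ok_or_else(|| format!("key not found: {}", key)),
        _ => Err("selector requires MAP or LIST types".to_owned()),
    }
}

/// Mirrors `CASE WHEN TYPEOF(a) = 'MAP' THEN a['x'] ELSE a END`.
pub fn guarded_select(value: &Value, key: &str) -> Result<Value, String> {
    if type_of(value) == "MAP" {
        select_key(value, key)
    } else {
        Ok(value.clone())
    }
}
```

With the guard in place, both rows from the `Items` example evaluate without error: the nested row is unwrapped through the selector and the plain row is passed through unchanged.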

Collaborator

I like that!
TYPEOF sounds better 👍

.as_str());

test!(
"SELECT name, dex, rare FROM Item WHERE id = 100",
Collaborator

Is there a way to filter with Map like this?

gluesql> SELECT * FROM Item WHERE obj = '{"cost": 3000}';
-- currently 0 rows

Member Author

PartialEq between Value::Map(_) values is already implemented, so this can be supported. We can consider using CAST for now.

SELECT * FROM Item WHERE obj = CAST('{ "cost": 3000 }' AS MAP);

This is not supported yet; we need to add a MAP cast to Value::try_cast_from_literal.
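
As a rough illustration of why the cast matters, the sketch below uses a trimmed `Value` enum with a derived `PartialEq`. The names `cost_map` and `filter_eq` are hypothetical, and the map is built by hand instead of being parsed from a literal. One plausible reading of the "0 rows" result is that a text literal never compares equal to a map, while two maps with the same entries do.

```rust
use std::collections::HashMap;

/// Trimmed stand-in for the value type (assumption: only three variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    I64(i64),
    Str(String),
    Map(HashMap<String, Value>),
}

/// Hypothetical stand-in for `CAST('{ "cost": <n> }' AS MAP)`:
/// constructs the map directly rather than parsing JSON.
pub fn cost_map(cost: i64) -> Value {
    Value::Map(HashMap::from([("cost".to_owned(), Value::I64(cost))]))
}

/// Mirrors `WHERE obj = <target>` over a column of values.
pub fn filter_eq<'a>(column: &'a [Value], target: &Value) -> Vec<&'a Value> {
    column.iter().filter(|value| *value == target).collect()
}
```

Comparing against `Value::Str` matches nothing, whereas comparing against the map built by `cost_map(3000)` matches the row holding that map. That is the gap a MAP cast would close.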

Member

@ever0de ever0de left a comment

It looks good overall. It was an interesting piece of work! Thank you!

@panarch panarch merged commit cd77c56 into main Dec 25, 2022
@panarch panarch deleted the schemaless-schemaless branch December 25, 2022 11:05
Labels
enhancement New feature or request

4 participants