Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle BOOLEAN data type in the backend #387

Closed
Tracked by #247
kgodey opened this issue Jul 15, 2021 · 14 comments · Fixed by #980
Closed
Tracked by #247

Handle BOOLEAN data type in the backend #387

kgodey opened this issue Jul 15, 2021 · 14 comments · Fixed by #980
Assignees
Labels
ready Ready for implementation type: enhancement New feature or request work: backend Related to Python, Django, and simple SQL

Comments

@kgodey
Copy link
Contributor

kgodey commented Jul 15, 2021

This issue is to ensure that Mathesar can handle the Postgres BOOLEAN data type.

As part of this issue, we need to ensure that:

  • Users can use the API to change a column to BOOLEAN if it's possible to do so.
  • Automatic type inference during file import suggests BOOLEAN when it makes sense to do so.

Most of this is already implemented, this issue is to review the current implementation, ensure it works, and see if we can make improvements to it.

Additional Context

@kgodey kgodey added this to the 07. Initial Data Types milestone Jul 15, 2021
@kgodey kgodey added work: backend Related to Python, Django, and simple SQL ready Ready for implementation type: enhancement New feature or request work: database labels Jul 15, 2021
@mathemancer
Copy link
Contributor

mathemancer commented Jul 16, 2021

We should double-check that the strings we cast to boolean are what we want. We're currently restricting what gets cast from the PostgreSQL default to only 't', 'f', 'true', 'false' being castable to BOOLEAN.

@kgodey
Copy link
Contributor Author

kgodey commented Jul 16, 2021

It seems like we should at least accept the Postgres default values of true, t, yes. y, on, 1 and false, f, no, n, off, 0.

It would also be nice to ensure that any column that we're recommending to be BOOLEAN has only two values (plus empty values). e.g. we want to infer that a column is BOOLEAN if every value in the CSV is either yes or no, but we do not want to recommend it if they're mostly yes or no, but there's a couple of maybes in there.

@mathemancer
Copy link
Contributor

mathemancer commented Jul 19, 2021

It seems like we should at least accept the Postgres default values of true, t, yes. y, on, 1 and false, f, no, n, off, 0.

It would also be nice to ensure that any column that we're recommending to be BOOLEAN has only two values (plus empty values). e.g. we want to infer that a column is BOOLEAN if every value in the CSV is either yes or no, but we do not want to recommend it if they're mostly yes or no, but there's a couple of maybes in there.

At the moment, it's not possible to accomplish this combination of goals with our current inference functions, or the paradigm under which they work together. Please recall that we decided to stay away from trying to infer column type based on global attributes for the moment (such as "the column has only two non-null values"). We could change that, but it's a major overhaul of the entire inference system we currently have set up, and would be a large(ish) time investment.

Alternatively, we could hack together a "one-off" solution just for Booleans, but I'd rather come up with a type inference setup that's global-property-aware in general.

@kgodey
Copy link
Contributor Author

kgodey commented Jul 19, 2021

@mathemancer I remember now, I'll set up a separate issue for overhauling our type inference setup to be global-property-aware (probably post-MVP).

Edited to add: here's the new issue #438

@powellc
Copy link
Contributor

powellc commented Jul 20, 2021

@kgodey do you think #438 blocks this task? Or is the issue just not being able to check the whole incoming column for only boolean values? So this ticket would implement a "rudimentary" BOOLEAN data type that would only check per row?

@kgodey
Copy link
Contributor Author

kgodey commented Jul 20, 2021

@powellc #438 doesn't block this task, it's a future improvement. We should go ahead and implement the rudimentary version of the type. I'm trying to keep the scope for the MVP as small as feasible so we ship on time.

@kgodey
Copy link
Contributor Author

kgodey commented Jul 26, 2021

@powellc Checking in to see if you have any questions.

@powellc
Copy link
Contributor

powellc commented Jul 27, 2021

Nope. Got a lot of clarity on how the type system works from Brent. Will dig into this later today. Thanks @kgodey

@powellc
Copy link
Contributor

powellc commented Aug 6, 2021

Juist a quick update here. I finally got my head around the API design and basics of adding and updating things and have implemented the ability to alter a column type to Boolean. This utilizes some functions that already existsed so it was really just turning on the feature in the PATCH method for Tables.

The type_suggestions endpoint already provides a guess at the right column type if there are only boolean-like values in, so that AC should be good.

I'm going to add some tests for the column altering and open a PR today.

@kgodey
Copy link
Contributor Author

kgodey commented Aug 6, 2021

Sounds good, @powellc!

powellc added a commit that referenced this issue Aug 18, 2021
This commit adds the ability to use a PATCH command against the
`/tables/<id>/` endpoint with a payload of:

`{"columns": [{"name": "col_name", "type": "BOOLEAN"}]}`

which will alter the column type for `col_name` to Boolean if the values
in that row support a conversion to Boolean.
powellc added a commit that referenced this issue Aug 18, 2021
This commit adds the ability to use a PATCH command against the
`/tables/<id>/` endpoint with a payload of:

`{"columns": [{"name": "col_name", "type": "BOOLEAN"}]}`

which will alter the column type for `col_name` to Boolean if the values
in that row support a conversion to Boolean.
powellc added a commit that referenced this issue Aug 18, 2021
This commit adds the ability to use a PATCH command against the
`/tables/<id>/` endpoint with a payload of:

`{"columns": [{"name": "col_name", "type": "BOOLEAN"}]}`

which will alter the column type for `col_name` to Boolean if the values
in that row support a conversion to Boolean.
powellc added a commit that referenced this issue Aug 19, 2021
This commit adds the ability to use a PATCH command against the
`/tables/<id>/` endpoint with a payload of:

`{"columns": [{"name": "col_name", "type": "BOOLEAN"}]}`

which will alter the column type for `col_name` to Boolean if the values
in that row support a conversion to Boolean.
@powellc
Copy link
Contributor

powellc commented Aug 19, 2021

Moving this back to To do before talking with @kgodey tomorrow about PATCH verb on a columns endpoint. That's the only piece missing, as we already correctly infer Boolean column types.

@powellc
Copy link
Contributor

powellc commented Sep 1, 2021

@kgodey may I close this with your work work on the columns endpoint done? The inference of Boolean types works as expected and with the updating of column types from your work we should be good here.

@kgodey
Copy link
Contributor Author

kgodey commented Sep 1, 2021

@powellc I think it would be good to add some BOOLEAN specific tests for both inference and column alteration at the API layer. I think our tests are currently only at the db layer.

@powellc
Copy link
Contributor

powellc commented Sep 1, 2021

@kgodey sounds good! I'll work on that this evening.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready Ready for implementation type: enhancement New feature or request work: backend Related to Python, Django, and simple SQL
Projects
No open projects
3 participants