Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove .detect_column_types() from table, make it a documented API #81

Closed
simonw opened this issue Feb 1, 2020 · 4 comments
Closed
Labels
documentation enhancement New feature or request

Comments

@simonw
Copy link
Owner

simonw commented Feb 1, 2020

I used it in geojson-to-sqlite here: https://github.com/simonw/geojson-to-sqlite/blob/f10e44264712dd59ae7dfa2e6fd5a904b682fb33/geojson_to_sqlite/utils.py#L45-L50

It would make more sense for this method to live on the Database rather than the Table - or even to exist as a separate utility method entirely.

Then it should be documented.

@simonw simonw added enhancement New feature or request documentation labels Feb 1, 2020
@simonw simonw changed the title Move .detect_column_types() from table to db, make it a documented API Move .detect_column_types() from table, make it a documented API Feb 1, 2020
@simonw
Copy link
Owner Author

simonw commented Feb 1, 2020

Here's the current method:

def detect_column_types(self, records):
all_column_types = {}
for record in records:
for key, value in record.items():
all_column_types.setdefault(key, set()).add(type(value))
column_types = {}
for key, types in all_column_types.items():
if len(types) == 1:
t = list(types)[0]
# But if it's list / tuple / dict, use str instead as we
# will be storing it as JSON in the table
if t in (list, tuple, dict):
t = str
elif {int, bool}.issuperset(types):
t = int
elif {int, float, bool}.issuperset(types):
t = float
elif {bytes, str}.issuperset(types):
t = bytes
else:
t = str
column_types[key] = t
return column_types

If I make it a utility function instead of a class method I could ensure it is directly importable like so:

from sqlite_utils import detect_column_types

@simonw
Copy link
Owner Author

simonw commented Feb 1, 2020

Should I keep table.detect_column_types() working so as not to break existing code?

If it was part of the documented API then I wouldn't break that without bumping to 3.x. Since it's undocumented I'm going to make it as a breaking change instead (and bump the geojson-to-sqlite dependency version).

@simonw
Copy link
Owner Author

simonw commented Feb 1, 2020

Actually I'll put it in the utils.py module.

@simonw
Copy link
Owner Author

simonw commented Feb 1, 2020

While I'm at it I think I'll rename it to suggest_column_types - it's not really detecting them since the input is just a list of dictionaries.

@simonw simonw changed the title Move .detect_column_types() from table, make it a documented API Remove .detect_column_types() from table, make it a documented API Feb 1, 2020
@simonw simonw closed this as completed in de76168 Feb 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant