
Map more Python types to SQLite types #84

Merged (2 commits) on Dec 13, 2023

Conversation

ahuang11
Contributor

@ahuang11 ahuang11 commented Dec 7, 2023

Fixes #75 but likely in an unsatisfactory way.

I combed through the pandas library looking for the best way to convert Python/pandas/NumPy dtypes into the corresponding SQL types; each option has its own trade-offs.

In the end, since I don't know much about HoloNote's internals (specifically, whether I can update SpecItem, or how much of HoloNote I should change; for example, can we completely remove SQLiteDB(Connector) from connector.py and use SQLAlchemy?), I chose the most compatible path forward.

However, I list my notes below:

# 1 doesn't touch internal method, but needs to parse the CREATE TABLE string to get schema, not sure where to inject inside HoloNote
# Outputs: `CREATE TABLE "test" (\n"index" INTEGER,\n  "a" INTEGER,\n  "b" REAL,\n  "c" TEXT,\n  "d"...`
table = pd.io.sql.SQLiteTable("test", None, frame=df)
table.sql_schema()
# 2 touches internal method, and not sure where to inject inside HoloNote
# however, for sqlalchemy, table._sql_type_name is _sqlalchemy_type
table = pd.io.sql.SQLiteTable("test", None, frame=df)
print(table._get_column_names_and_types(table._sql_type_name))
# 3 still need to use mapper, and not sure where to inject inside HoloNote
for col in df.columns:
    try:
        print(col, type(df[col][0]), pd.api.types.pandas_dtype(type(df[col][0])))
    except Exception:
        print(col, type(df[col][0]), "FAILED")
        continue
# 4 still needs to use mapper and not sure where to inject inside HoloNote
for col in df.columns:
    print(pd.api.types.infer_dtype(df[col]))
# 5 include SQLAlchemy dependency
# 6 include pyarrow dependency
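
For comparison with the options above, here is a minimal sketch of a mapper that avoids an exact-type dictionary lookup, which is the kind of lookup that raises `KeyError` for `numpy.float32`. The function name `sqlite_type_for` is hypothetical and this is not HoloNote's implementation. It relies on the standard-library `numbers` abstract base classes; to my understanding NumPy registers its integer and floating scalar types with `numbers.Integral` and `numbers.Real`, so `issubclass` checks should cover them without listing every dtype.

```python
import datetime
import numbers


def sqlite_type_for(py_type: type) -> str:
    """Return a SQLite column type name for a Python scalar type.

    Hypothetical helper for illustration only; not part of HoloNote.
    """
    # Integral is checked before Real because every Integral is also a Real.
    # bool is a subclass of int, so it maps to INTEGER as well.
    if issubclass(py_type, numbers.Integral):
        return "INTEGER"
    if issubclass(py_type, numbers.Real):
        return "REAL"
    if issubclass(py_type, str):
        return "TEXT"
    if issubclass(py_type, bytes):
        return "BLOB"
    # datetime.datetime is a subclass of datetime.date.
    if issubclass(py_type, datetime.date):
        return "TIMESTAMP"
    raise TypeError(f"No SQLite type mapping for {py_type!r}")


# Usage: build the column definitions from one sample row.
sample_row = {"start": 1.5, "count": 3, "label": "a"}
columns = {name: sqlite_type_for(type(value)) for name, value in sample_row.items()}
```

The trade-off is that subclass checks are order-sensitive and slightly slower than a dictionary lookup, but new numeric scalar types are picked up without editing the mapping.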

I was also wondering why we need to have the connector SQLiteDB.create_table define the types inside? Why not have df.to_sql() figure it out?

Also, do we need a Connector class or can we depend on SQLAlchemy's classes?

@ahuang11 ahuang11 changed the title from "Add types" to "Map more Python types to SQLite types" on Dec 7, 2023

codspeed-hq bot commented Dec 7, 2023

CodSpeed Performance Report

Merging #84 will improve performances by 14.2%

Comparing ahuang11:add_types (8f78c0d) with main (8d8132d)

Summary

⚡ 6 improvements

Benchmarks breakdown

| Benchmark | main | ahuang11:add_types | Change |
|---|---|---|---|
| test_commit[100] | 1.2 s | 1 s | +15.89% |
| test_define_annotations[10] | 90.1 ms | 78.9 ms | +14.2% |
| test_commit[1000] | 13.4 s | 10.4 s | +29.67% |
| test_define_annotations[1000] | 561.8 ms | 469.4 ms | +19.7% |
| test_commit[10] | 205.6 ms | 177.8 ms | +15.6% |
| test_define_annotations[100] | 132.1 ms | 109.6 ms | +20.52% |

@ahuang11
Contributor Author

Thanks for fixing the tests!

@hoxbro
Member

hoxbro commented Dec 13, 2023

I think the suggested change is good and will merge it.

> can we completely remove SQLiteDB(Connector) from connector.py and use SQLAlchemy

I don't think we can ever completely remove the need for a Connector, but it should be as thin a layer over the database as possible. I see the class the same way as pd.io.sql.SQLiteTable: it doesn't handle any of the communication itself, but gives us a simple interface for sending data to and getting data from a database.

We want to add an SQLAlchemy connector; see #65.
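
To make the "thin layer" idea concrete, here is a rough sketch of a connector that only builds SQL strings and delegates all communication to the standard-library `sqlite3` module. The class name `ThinSQLiteConnector` and its method names are hypothetical; this is not HoloNote's `SQLiteDB`, and it assumes table and column names come from trusted code rather than user input.

```python
import sqlite3


class ThinSQLiteConnector:
    """Hypothetical thin wrapper around sqlite3, for illustration only."""

    def __init__(self, path: str = ":memory:"):
        self.con = sqlite3.connect(path)

    def create_table(self, name: str, columns: dict) -> None:
        # columns maps column name -> SQLite type name, e.g. {"start": "REAL"}.
        cols = ", ".join(f'"{col}" {sql_type}' for col, sql_type in columns.items())
        self.con.execute(f'CREATE TABLE IF NOT EXISTS "{name}" ({cols})')

    def add_row(self, name: str, row: dict) -> None:
        # Values are passed as parameters; only identifiers are interpolated.
        col_names = ", ".join(f'"{col}"' for col in row)
        placeholders = ", ".join("?" for _ in row)
        self.con.execute(
            f'INSERT INTO "{name}" ({col_names}) VALUES ({placeholders})',
            tuple(row.values()),
        )
        self.con.commit()

    def load(self, name: str) -> list:
        # Return every row as a dict keyed by column name.
        cur = self.con.execute(f'SELECT * FROM "{name}"')
        names = [desc[0] for desc in cur.description]
        return [dict(zip(names, values)) for values in cur.fetchall()]


# Usage with an in-memory database.
db = ThinSQLiteConnector()
db.create_table("annotations", {"start": "REAL", "label": "TEXT"})
db.add_row("annotations", {"start": 1.5, "label": "a"})
rows = db.load("annotations")
```

An SQLAlchemy-backed connector could presumably expose the same three methods, so the rest of the code would not need to know which backend is in use.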

@hoxbro hoxbro merged commit f90adb3 into holoviz:main Dec 13, 2023
17 checks passed
Development

Successfully merging this pull request may close these issues.

kdims KeyError: <class 'numpy.float32'>
2 participants