Improve `SqlTypeName` to support more types and also improve error handling #824

andygrove · 2022-09-29T20:34:51Z

Improve SqlTypeName::from_string() to support more types, using the SQL parser to parse parameterized types such as VARCHAR(n) and DECIMAL(p, s).
Replace some todo! and unimplemented! calls with Err, and update the call sites accordingly.
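To illustrate the second point, here is a minimal sketch of returning `Err` for an unrecognized type name instead of panicking with `todo!` or `unimplemented!`. It is not the PR's code: the enum is trimmed to the four variants visible in the reviewed diff, and the plain `String` error is an assumption made to keep the sketch dependency-free (the traceback later in this thread suggests the real code uses an `Internal(...)` error variant).

```rust
// Sketch only: a cut-down SqlTypeName with the variants shown in the diff.
#[derive(Debug, Clone, Copy, PartialEq)]
#[allow(non_camel_case_types)]
pub enum SqlTypeName {
    BINARY,
    VARBINARY,
    CHAR,
    VARCHAR,
}

/// Map a type name to a variant. Unknown names produce an `Err` that the
/// caller can handle, where a `todo!()` arm would abort with a panic.
/// The `String` error type is an assumption for this sketch.
pub fn from_string(input: &str) -> Result<SqlTypeName, String> {
    match input.to_uppercase().as_str() {
        "BINARY" => Ok(SqlTypeName::BINARY),
        "VARBINARY" => Ok(SqlTypeName::VARBINARY),
        "CHAR" => Ok(SqlTypeName::CHAR),
        "VARCHAR" | "STRING" => Ok(SqlTypeName::VARCHAR),
        other => Err(format!("Cannot determine SQL type name for '{}'", other)),
    }
}
```

Call sites can then propagate the failure with `?` or turn it into a user-facing error, rather than crashing the process.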

codecov-commenter · 2022-09-29T21:09:22Z

Codecov Report

Merging #824 (409a09b) into main (a7583b5) will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #824      +/-   ##
==========================================
- Coverage   74.88%   74.86%   -0.03%     
==========================================
  Files          71       71              
  Lines        3588     3588              
  Branches      748      748              
==========================================
- Hits         2687     2686       -1     
+ Misses        771      768       -3     
- Partials      130      134       +4

| Impacted Files | Coverage Δ | |
|---|---|---|
| dask_sql/_version.py | `32.27% <0.00%> (-0.29%)` | ⬇️ |

randerzander · 2022-09-29T21:16:52Z

dask_planner/src/sql/types.rs

+            "BINARY" => Ok(SqlTypeName::BINARY),
+            "VARBINARY" => Ok(SqlTypeName::VARBINARY),
+            "CHAR" => Ok(SqlTypeName::CHAR),
+            "VARCHAR" | "STRING" => Ok(SqlTypeName::VARCHAR),


I've just tried these changes with my own tests.

I can now use "string" types, but varchars w/ a defined limit still fail:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/context.py", line 238, in create_table
        dc = InputUtil.to_dc(
      File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/input_utils/convert.py", line 68, in to_dc
        table = filled_get_dask_dataframe(input_item)
      File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/input_utils/convert.py", line 57, in <lambda>
        filled_get_dask_dataframe = lambda *args: cls._get_dask_dataframe(
      File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/input_utils/convert.py", line 90, in _get_dask_dataframe
        return plugin.to_dc(
      File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/input_utils/hive.py", line 69, in to_dc
        column_information = {
      File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/input_utils/hive.py", line 70, in <dictcomp>
        col: sql_to_python_type(SqlTypeName.fromString(col_type.upper()))
    RuntimeError: Internal("Cannot determine SQL type name for 'VARCHAR(65535)'")

andygrove · 2022-09-29

I have now pushed changes to support VARCHAR(n) and some other parameterized types.
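As a rough illustration of what handling a parameterized type such as `VARCHAR(65535)` or `DECIMAL(p, s)` involves, the sketch below splits a type string into a base name and its numeric parameters. This is a hand-rolled stand-in, not the PR's approach: the PR description says the real implementation delegates to the SQL parser, and `split_type` is a hypothetical helper name introduced only for this example.

```rust
/// Split a type string such as "DECIMAL(10, 2)" into an upper-cased base
/// name and its numeric parameters. Hypothetical helper for illustration;
/// the PR itself relies on the SQL parser instead of manual splitting.
pub fn split_type(input: &str) -> Result<(String, Vec<u64>), String> {
    let trimmed = input.trim();

    // No parenthesis: a plain type name with no parameters.
    let open = match trimmed.find('(') {
        Some(idx) => idx,
        None => return Ok((trimmed.to_uppercase(), Vec::new())),
    };
    if !trimmed.ends_with(')') {
        return Err(format!("Unbalanced parentheses in '{}'", trimmed));
    }

    let base = trimmed[..open].trim().to_uppercase();
    let inner = &trimmed[open + 1..trimmed.len() - 1];

    // Every comma-separated parameter must be an unsigned integer.
    let params = inner
        .split(',')
        .map(|p| {
            p.trim()
                .parse::<u64>()
                .map_err(|e| format!("Bad parameter '{}': {}", p.trim(), e))
        })
        .collect::<Result<Vec<u64>, String>>()?;

    Ok((base, params))
}
```

With this split, the base name (`VARCHAR`) can be matched against the known type names, so a declared length no longer causes the lookup to fail.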

andygrove · 2022-09-29T21:41:51Z

cargo test is failing with:

  /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.1/src/types/any.rs:813: undefined reference to `PyObject_Str'
          collect2: error: ld returned 1 exit status

andygrove · 2022-09-30T14:58:01Z

cargo test fails on this PR but not on others, so something in this PR is causing a linker issue. Very odd. I am looking into it.

andygrove · 2022-09-30T15:11:43Z

TIL that you cannot reference PyMethods from Rust unit tests (at least, not without additional environment changes). Tests should now be passing.
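One common way to work within this limitation is to keep the logic in plain Rust functions and have unit tests call those, leaving the Python-facing methods as thin wrappers. The sketch below assumes the linker failure comes from the test binary referencing Python C-API symbols such as `PyObject_Str`; `parse_type_name` and `TypeKind` are hypothetical names, and this is not necessarily how the PR fixed its tests.

```rust
// Sketch only: `parse_type_name` and `TypeKind` are hypothetical names.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TypeKind {
    Character,
    Binary,
}

/// Plain Rust function holding the actual logic. It touches no Python
/// objects, so a test that calls it should not need Python symbols.
pub fn parse_type_name(name: &str) -> Result<TypeKind, String> {
    match name.trim().to_uppercase().as_str() {
        "CHAR" | "VARCHAR" | "STRING" => Ok(TypeKind::Character),
        "BINARY" | "VARBINARY" => Ok(TypeKind::Binary),
        other => Err(format!("Cannot determine SQL type name for '{}'", other)),
    }
}

// A `#[pymethods]` wrapper (omitted here because it needs the pyo3 crate)
// would only convert arguments and errors, then delegate to
// `parse_type_name`.

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parses_without_python() {
        assert_eq!(parse_type_name("string"), Ok(TypeKind::Character));
        assert!(parse_type_name("VARCHAR(65535)").is_err());
    }
}
```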

support string sql type

fbd56d0

andygrove requested review from ayushdg, galipremsagar and jdye64 as code owners September 29, 2022 20:34

revert removing toolchain file

7a1ca5c

andygrove marked this pull request as draft September 29, 2022 20:53

tests

7cc5e30

randerzander reviewed Sep 29, 2022

error handling

5c1cc9f

andygrove changed the title ~~support string sql type~~ Improve SqlTypeName to support more types and also improve error handling Sep 29, 2022

andygrove marked this pull request as ready for review September 29, 2022 21:40

fix tests

409a09b

ayushdg approved these changes Sep 30, 2022

ayushdg merged commit bd6788d into dask-contrib:main Sep 30, 2022

andygrove deleted the string-type branch September 30, 2022 17:57

randerzander mentioned this pull request Nov 30, 2022

[DOC] RAPIDS 22.12 Release Blog Outline rapidsai/cudf#12057

Closed
