Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve SqlTypeName to support more types and also improve error handling #824

Merged
merged 5 commits into from
Sep 30, 2022

Conversation

andygrove
Copy link
Contributor

@andygrove andygrove commented Sep 29, 2022

  • Improve SqlTypeName::from_string() to support more types, using the sql parser to parse parameterized types such as VARCHAR(n) and DECIMAL(p, s)
  • Replaced some todo! and unimplemented! with Err and made corresponding changes in call sites

@andygrove andygrove marked this pull request as draft September 29, 2022 20:53
@codecov-commenter
Copy link

codecov-commenter commented Sep 29, 2022

Codecov Report

Merging #824 (409a09b) into main (a7583b5) will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #824      +/-   ##
==========================================
- Coverage   74.88%   74.86%   -0.03%     
==========================================
  Files          71       71              
  Lines        3588     3588              
  Branches      748      748              
==========================================
- Hits         2687     2686       -1     
+ Misses        771      768       -3     
- Partials      130      134       +4     
Impacted Files Coverage Δ
dask_sql/_version.py 32.27% <0.00%> (-0.29%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

"BINARY" => Ok(SqlTypeName::BINARY),
"VARBINARY" => Ok(SqlTypeName::VARBINARY),
"CHAR" => Ok(SqlTypeName::CHAR),
"VARCHAR" | "STRING" => Ok(SqlTypeName::VARCHAR),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just tried these changes with my own tests.

I can now use "string" types, but varchars w/ a defined limit still fail:

Traceback (most recent call last):               
  File "<stdin>", line 1, in <module>                                                                            
  File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/context.py", line 238, in create_table
    dc = InputUtil.to_dc(                                                                                                   
  File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/input_utils/convert.py", line 68, in to_dc
    table = filled_get_dask_dataframe(input_item)                                                           
  File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/input_utils/convert.py", line 57, in <lambda>
    filled_get_dask_dataframe = lambda *args: cls._get_dask_dataframe(                                                      
  File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/input_utils/convert.py", line 90, in _get_dask_dataframe
    return plugin.to_dc(                                                                               
  File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/input_utils/hive.py", line 69, in to_dc
    column_information = {                                                                                    
  File "/opt/conda/envs/rapids/lib/python3.9/site-packages/dask_sql/input_utils/hive.py", line 70, in <dictcomp>
    col: sql_to_python_type(SqlTypeName.fromString(col_type.upper()))                                      
RuntimeError: Internal("Cannot determine SQL type name for 'VARCHAR(65535)'")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now pushed changes to support VARCHAR(n) and some other parameterized types

@andygrove andygrove changed the title support string sql type Improve SqlTypeName to support more types and also improve error handling Sep 29, 2022
@andygrove andygrove marked this pull request as ready for review September 29, 2022 21:40
@andygrove
Copy link
Contributor Author

cargo test is failing with:

  /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.1/src/types/any.rs:813: undefined reference to `PyObject_Str'
          collect2: error: ld returned 1 exit status

@andygrove
Copy link
Contributor Author

cargo test fails on this PR but not on others, so something in this PR is causing a linker issue. Very odd. I am looking into it.

@andygrove
Copy link
Contributor Author

TIL that you cannot reference PyMethods from Rust unit tests (at least, not without additional environment changes). Tests should now be passing.

@ayushdg ayushdg merged commit bd6788d into dask-contrib:main Sep 30, 2022
@andygrove andygrove deleted the string-type branch September 30, 2022 17:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants