Update sql-using-python-udf.py by Ritinikhil · Pull Request #1 · timsaucer/datafusion-python

Ritinikhil · 2025-03-06T03:08:42Z

enhance sql-using-python-udf example

- Add comprehensive comments and documentation
- Implement multiple data registration methods for API compatibility
- Add version information printing for debugging
- Improve error handling with informative messages
- Add formatted table output for better readability
- Include input validation through PyArrow schema
- Add results verification with assertions

Author: Ritinikhil
Date: 2025-03-06

Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?


improve path handling in substrait example

- Add cross-platform path handling using os.path
- Add error handling for CSV file registration
- Improve code documentation
- Remove hard-coded Windows paths
- Keep Apache license header intact

Author: Ritinikhil
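The cross-platform path handling and the error handling described in this commit might look roughly like the sketch below. The helper names and any file names passed to them are hypothetical, and the CSV registration step is stood in for by a plain existence check rather than a real DataFusion call.

```python
import os


def resolve_data_file(base_dir: str, *parts: str) -> str:
    """Join path components with the platform separator and normalise the result."""
    return os.path.normpath(os.path.join(base_dir, *parts))


def checked_data_file(base_dir: str, *parts: str) -> str:
    """Return the resolved path, or raise a clear error if the file is missing."""
    path = resolve_data_file(base_dir, *parts)
    if not os.path.isfile(path):
        # Fail early with an informative message instead of a cryptic
        # error from whatever later tries to read the file.
        raise FileNotFoundError(f"CSV file not found: {path}")
    return path
```

A script would typically derive `base_dir` from its own location, for example `os.path.dirname(os.path.abspath(__file__))`, so that no drive letter or separator is hard-coded.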

timsaucer · 2025-03-09T12:53:43Z

I just noticed this - can you change your upstream to the main repo rather than my fork?

- Wrap CASE/WHEN method-chain examples in parentheses and assign to a variable so they are valid Python as shown (Copilot #1, #2).
- Fix INTERSECT/EXCEPT mapping: the default distinct=False corresponds to INTERSECT ALL / EXCEPT ALL, not the distinct forms. Updated both the Set Operations section and the SQL reference table to show both the ALL and distinct variants (Copilot apache#4).
- Change write_parquet / write_csv / write_json examples to file-style paths (output.parquet, etc.) to match the convention used in existing tests and examples. Note that a directory path is also valid for partitioned output (Copilot apache#5).

Verified INTERSECT/EXCEPT semantics with a script:

    df1.intersect(df2) -> [1, 1, 2] (= INTERSECT ALL)
    df1.intersect(df2, distinct=True) -> [1, 2] (= INTERSECT)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
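The difference between the ALL and distinct forms of INTERSECT and EXCEPT can be illustrated without DataFusion at all. The sketch below models both semantics on plain Python lists using multisets. The function names and sample inputs are hypothetical, chosen so that the intersect results match the values quoted in the commit message; none of this is DataFusion API.

```python
from collections import Counter


def intersect_all(left, right):
    """INTERSECT ALL: keep each value min(count_left, count_right) times."""
    return sorted((Counter(left) & Counter(right)).elements())


def intersect_distinct(left, right):
    """INTERSECT: keep each common value exactly once."""
    return sorted(set(left) & set(right))


def except_all(left, right):
    """EXCEPT ALL: subtract occurrence counts, dropping non-positive ones."""
    return sorted((Counter(left) - Counter(right)).elements())


def except_distinct(left, right):
    """EXCEPT: values present on the left and absent on the right, once each."""
    return sorted(set(left) - set(right))


# Hypothetical sample data, not taken from the PR.
a = [1, 1, 2, 3]
b = [1, 1, 2, 4]

print(intersect_all(a, b))       # [1, 1, 2]
print(intersect_distinct(a, b))  # [1, 2]
```

With the same idea, `except_all([1, 1, 2, 3], [1, 2])` yields `[1, 3]`, while `except_distinct` on those inputs yields `[3]`, which is why the default has to be documented as the ALL form.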

Ritinikhil added 2 commits March 6, 2025 08:37

Ritinikhil closed this by deleting the head repository Mar 9, 2025

Ritinikhil wants to merge 2 commits into timsaucer:main from Ritinikhil:main
