@ericvergnaud
Contributor

There is a (Python?) bug where calling dataclasses.fields(some_data_class) sometimes returns fields whose type is a type name instead of a type (for example 'str' instead of <class 'str'>). This prevents our ORM from working as expected.
The issue can be reproduced in databrickslabs/ucx#2401, where it is currently fixed with a local workaround; see https://github.com/databrickslabs/ucx/pull/2401/files#diff-40a8f3f4a3a5e5b222ccbfb95204001a6b007c78cbe55b9f1a53af5a8309886eR349
This PR works around the issue globally.
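
The thread does not pin down the cause. One well-documented way to end up with string-typed fields is a string annotation, either written explicitly or produced by `from __future__ import annotations` (PEP 563): `dataclasses.fields()` then reports the annotation text as the field's `type`. The sketch below reproduces that behavior with explicit string annotations. Whether this is the trigger in databrickslabs/ucx#2401 is an assumption, and the `Foo` class is hypothetical.

```python
import dataclasses


@dataclasses.dataclass
class Foo:  # hypothetical dataclass, not taken from the PR
    name: "str"   # string annotation: stored as the text 'str'
    count: int    # regular annotation: stored as the class int


# dataclasses does not evaluate string annotations, so the first
# field's type is the string 'str' rather than the class str.
field_types = {f.name: f.type for f in dataclasses.fields(Foo)}
print(field_types)
```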

@github-actions
github-actions bot commented Aug 27, 2024

✅ 34/34 passed, 3 flaky, 3 skipped, 39m53s total

Flaky tests:

  • 🤪 test_dashboards_creates_dashboard_from_query_with_cte (7.394s)
  • 🤪 test_dashboards_creates_dashboard_with_widget_title_and_description (8.234s)
  • 🤪 test_dashboards_creates_dashboard_with_filters (7.552s)

Running from acceptance #362

Collaborator

@nfx nfx left a comment

lgtm

@nfx nfx merged commit 42d4cce into main Aug 29, 2024
@nfx nfx deleted the fix-dataclass-field-types branch August 29, 2024 21:07
nfx added a commit that referenced this pull request Aug 30, 2024
* Fixed dataclass field types ([#257](#257)). This PR works around a suspected Python bug in which `dataclasses.fields()` sometimes returns field types as type-name strings instead of types, which can cause the ORM to malfunction. The workaround checks whether the returned `f.type` is a string and, if so, converts it to a type by looking it up in the `__builtins__` dictionary. The change is applied globally in the `_schema_for` function in the `backends.py` file, which creates the schema for a given dataclass.
* Fixed missing EOL when formatting SQL files ([#260](#260)). In this release, we have addressed an issue related to the inconsistent addition of end-of-line (EOL) characters in formatted SQL files. The `QueryTile.format()` method has been updated to ensure that an EOL character is always added, except when the input query already ends with a newline. This change enhances the reliability of the SQL formatting functionality, making the output format more predictable and improving the overall user experience. The new implementation is demonstrated in the `test_query_format_preserves_eol()` test case, and existing test cases have been updated to check for the presence of EOL characters, further ensuring consistent and correct formatting.
* Fixed normalize case input in cli ([#258](#258)). In this release, we have updated the `fmt` command in the `cli.py` file to allow users to specify whether they want to normalize the case of SQL files when formatting. The `normalize_case` parameter now defaults to the string `"true"` and checks if it is in the `STRING_AFFIRMATIVES` list to determine whether to normalize the case of SQL files. Additionally, we have introduced a new optional `normalize_case` parameter in the `format` method of the `dashboards.py` file in the Databricks CLI, which normalizes the identifiers in the query to lower case when set to `True`. We have also added support for a new `normalize_case` parameter in the `QueryTile.format()` method, which prevents the automatic normalization of string input to uppercase when set to `False`. This change allows for more flexibility in handling string input and ensures that the input string is preserved as-is. These updates improve the functionality and usability of the open-source library, providing more control to users over formatting and handling of string input.
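
The EOL rule and the string-valued `normalize_case` flag described in the last two entries can be sketched as follows. This is an illustration only: `ensure_trailing_eol` and `is_affirmative` are hypothetical helper names, the contents of `STRING_AFFIRMATIVES` are assumed (the changelog names the list but does not show it), and the actual `QueryTile.format()` and `fmt` implementations may differ.

```python
# Assumed contents: the real STRING_AFFIRMATIVES list is not shown in the changelog.
STRING_AFFIRMATIVES = ["true", "yes", "y", "1"]


def is_affirmative(value: str) -> bool:
    """Interpret a string flag such as normalize_case="true" as a boolean."""
    # Trimming and lower-casing are choices made for this sketch.
    return value.strip().lower() in STRING_AFFIRMATIVES


def ensure_trailing_eol(query: str) -> str:
    """Append a newline unless the query already ends with one."""
    return query if query.endswith("\n") else query + "\n"


formatted = ensure_trailing_eol("SELECT 1")
unchanged = ensure_trailing_eol("SELECT 1\n")
```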
@nfx nfx mentioned this pull request Aug 30, 2024
nfx pushed a commit that referenced this pull request Sep 4, 2024
As a complement to #257, the dataclass field type needs to be adjusted in
other places.
This fix is required by databrickslabs/ucx#2526

Co-authored-by: Eric Vergnaud <eric.vergnaud@databricks.com>
nfx added a commit that referenced this pull request Sep 4, 2024
* Added documentation for exclude flag ([#265](#265)). A new `exclude` flag has been added to the configuration file for our lab tool, allowing users to specify a path to exclude from formatting during lab execution. This release also includes corrections to grammatical errors in the descriptions of existing flags related to catalog and database settings, such as updating `seperated` to "separate". Additionally, the flag descriptions for `publish` and `open-browser` have been updated for clarification: `publish` now clearly controls whether the dashboard is published after creation, while `open-browser` controls whether the dashboard is opened in a web browser. These changes are aimed at improving user experience and ease of use for our lab tool.
* Fixed dataclass field type in `_row_to_sql` ([#266](#266)). Following up on [#257](#257), this release fixes the dataclass field type in the `_row_to_sql` method of the `backends.py` file and updates the `_schema_for` method to use a new `_field_type` class method. This resolves a rare problem where `field.type` is a string instead of a type, and is required by a pull request in an external repository (<databrickslabs/ucx#2526>). The new `_field_type` method attempts to load the type from `__builtins__` if it is a string and logs a warning if that fails. Both `_schema_for` and `_row_to_sql` now obtain the field type through `_field_type`.
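
A minimal sketch of the resolution step described above, not the code merged in #257 or #266. The function name `field_type` and the `Foo` class are hypothetical. The sketch looks names up in the standard `builtins` module rather than in `__builtins__`, because `__builtins__` is a CPython implementation detail that can be either a module or a dictionary depending on where the code runs.

```python
import builtins
import dataclasses
import logging

logger = logging.getLogger(__name__)


def field_type(field: dataclasses.Field):
    """Return the field's type, resolving a builtin type name such as 'str'."""
    if isinstance(field.type, str):
        resolved = getattr(builtins, field.type, None)
        if isinstance(resolved, type):
            return resolved
        # Not a builtin type name: warn and fall through to the raw value.
        logger.warning("Could not resolve dataclass field type: %s", field.type)
    return field.type


@dataclasses.dataclass
class Foo:  # hypothetical dataclass used only for illustration
    name: "str"       # resolved to the class str
    count: int        # already a type, returned unchanged
    other: "Unknown"  # not a builtin, returned as the original string


resolved_types = [field_type(f) for f in dataclasses.fields(Foo)]
```

Only plain builtin names are handled here; a string such as `"list[str]"` or a user-defined class name would trigger the warning and be returned as-is.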