Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MAINTENANCE] Remove some references to block style datasource #9868

Merged
merged 36 commits into from
May 3, 2024

Conversation

tyler-hoffman
Copy link
Contributor

@tyler-hoffman tyler-hoffman commented May 2, 2024

Removes all references to block style Datasource except for in new_datasource.py, abstract_data_context.py, and an __init__.py file. There are likely a near infinite number of tests that depend on the existence of the class, though.

  • Description of PR changes above includes a link to an existing GitHub issue
  • PR title is prefixed with one of: [BUGFIX], [FEATURE], [DOCS], [MAINTENANCE], [CONTRIB]
  • Code is linted - run invoke lint (uses ruff format + ruff check)
  • Appropriate tests and docs have been updated

For more information about contributing, see Contribute.

After you submit your PR, keep the page open and monitor the statuses of the various checks made by our continuous integration process at the bottom of the page. Please fix any issues that come up and reach out on Slack if you need help. Thanks for contributing!

Copy link

netlify bot commented May 2, 2024

Deploy Preview for niobium-lead-7998 canceled.

Name Link
🔨 Latest commit 04c2d0b
🔍 Latest deploy log https://app.netlify.com/sites/niobium-lead-7998/deploys/6635458b5c8e810008717680

Copy link

codecov bot commented May 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.07%. Comparing base (bd52c5f) to head (04c2d0b).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9868      +/-   ##
===========================================
- Coverage    78.25%   78.07%   -0.18%     
===========================================
  Files          484      484              
  Lines        42394    42390       -4     
===========================================
- Hits         33174    33096      -78     
- Misses        9220     9294      +74     
Flag Coverage Δ
3.10 64.26% <ø> (-0.14%) ⬇️
3.11 64.26% <ø> (-0.14%) ⬇️
3.11 athena or clickhouse or openpyxl or pyarrow or project or sqlite or aws_creds 53.87% <ø> (-0.14%) ⬇️
3.11 aws_deps 44.77% <ø> (-0.18%) ⬇️
3.11 big 55.76% <ø> (+<0.01%) ⬆️
3.11 databricks 45.95% <ø> (-0.15%) ⬇️
3.11 filesystem 61.21% <ø> (-0.07%) ⬇️
3.11 mssql 48.77% <ø> (-0.21%) ⬇️
3.11 mysql 48.83% <ø> (-0.22%) ⬇️
3.11 postgresql 52.76% <ø> (-0.15%) ⬇️
3.11 snowflake 46.56% <ø> (-0.18%) ⬇️
3.11 spark 57.18% <ø> (-0.01%) ⬇️
3.11 trino 50.75% <ø> (-0.15%) ⬇️
3.8 64.28% <ø> (-0.13%) ⬇️
3.8 athena or clickhouse or openpyxl or pyarrow or project or sqlite or aws_creds 53.88% <ø> (-0.14%) ⬇️
3.8 aws_deps 44.78% <ø> (-0.18%) ⬇️
3.8 big 55.77% <ø> (+<0.01%) ⬆️
3.8 databricks 45.96% <ø> (-0.15%) ⬇️
3.8 filesystem 61.22% <ø> (-0.07%) ⬇️
3.8 mssql 48.75% <ø> (-0.21%) ⬇️
3.8 mysql 48.81% <ø> (-0.22%) ⬇️
3.8 postgresql 52.74% <ø> (-0.15%) ⬇️
3.8 snowflake 46.57% <ø> (-0.18%) ⬇️
3.8 spark 57.14% <ø> (-0.01%) ⬇️
3.8 trino 50.73% <ø> (-0.15%) ⬇️
3.9 64.28% <ø> (-0.14%) ⬇️
cloud 0.00% <ø> (ø)
docs-basic 49.10% <ø> (-0.01%) ⬇️
docs-creds-needed 50.23% <ø> (-0.01%) ⬇️
docs-spark 48.31% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tyler-hoffman tyler-hoffman changed the title M/v1 25/remove block style datasource [MAINTENANCE] Remove block style datasource May 2, 2024
@tyler-hoffman tyler-hoffman changed the title [MAINTENANCE] Remove block style datasource [MAINTENANCE] Remove some references to block style datasource May 2, 2024
@@ -913,28 +913,6 @@ def build_sa_validator_with_data( # noqa: C901, PLR0912, PLR0913, PLR0915
if context is None:
context = build_in_memory_runtime_context()

assert context is not None, 'Instance of any child of "AbstractDataContext" class is required.'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this stuff here? IDK, but things pass without it 🤷

@@ -226,246 +218,6 @@ def test_update_project_config(
assert context.progress_bars["globally"] is True


@pytest.mark.unit
def test_add_datasource_with_existing_datasource(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're going to remove datasource mgmt from context, so no use in reworking these tests.

},
},
)
# Updating "execution_engine" to insure peculiarities, incorporated herein, propagate to "ExecutionEngine" itself. # noqa: E501
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RIP

Comment on lines -99 to -217

@pytest.mark.unit
def test_get_datasource_cache_miss(in_memory_runtime_context) -> None:
"""
What does this test and why?

For all non-Cloud contexts, we should leverage the underlying store in the case
of a cache miss.
"""
context = in_memory_runtime_context

name = "my_fake_datasource_name"

# Initial GET will miss the cache, necessitating store retrieval
with mock.patch(
"great_expectations.datasource.datasource_dict.DatasourceDict.__getitem__"
) as mock_get:
context.get_datasource(name)

assert mock_get.called

# Subsequent GET will retrieve from the cache
with mock.patch("great_expectations.data_context.store.DatasourceStore.get") as mock_get:
context.get_datasource(name)

assert not mock_get.called


@pytest.mark.unit
def test_BaseDataContext_add_datasource_updates_cache(
in_memory_runtime_context: EphemeralDataContext,
pandas_enabled_datasource_config: dict,
) -> None:
"""
What does this test and why?

For persistence-disabled contexts, we should only update the cache upon adding a
datasource.
"""
context = in_memory_runtime_context

name = pandas_enabled_datasource_config["name"]

assert name not in context.datasources

context.add_datasource(**pandas_enabled_datasource_config)

assert name in context.datasources


@pytest.mark.unit
def test_BaseDataContext_update_datasource_updates_existing_data_source(
in_memory_runtime_context: EphemeralDataContext,
pandas_enabled_datasource_config: dict,
) -> None:
"""
What does this test and why?

Updating a Data Source should update a Data Source
"""
context = in_memory_runtime_context

name = context.list_datasources()[0]["name"]
pandas_enabled_datasource_config["name"] = name
data_connectors = pandas_enabled_datasource_config["data_connectors"]
pandas_enabled_datasource_config.pop("class_name")
datasource = Datasource(**pandas_enabled_datasource_config)

assert name in context.datasources
cached_datasource = context.datasources[name]
assert cached_datasource.data_connectors.keys() != data_connectors.keys() # type: ignore[union-attr]

context.update_datasource(datasource)

retrieved_datasource = context.get_datasource(datasource_name=name)
assert retrieved_datasource.data_connectors.keys() == data_connectors.keys() # type: ignore[union-attr]


@pytest.mark.unit
def test_BaseDataContext_update_datasource_fails_when_datsource_does_not_exist(
in_memory_runtime_context: EphemeralDataContext,
pandas_enabled_datasource_config: dict,
) -> None:
"""
What does this test and why?

Updating a data source that does not exist should create a new data source.
"""
context = in_memory_runtime_context

name = pandas_enabled_datasource_config["name"]
pandas_enabled_datasource_config.pop("class_name")
datasource = Datasource(**pandas_enabled_datasource_config)

assert name not in context.datasources

with pytest.raises(DatasourceNotFoundError):
context.update_datasource(datasource)


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just confirming that these are being removed because we don't support these methods on the data context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I think there were 2 files that tested all this crud on context, but since we'll remove these methods in https://greatexpectations.atlassian.net/browse/V1-321, I don't think there's sense in taking the time to port them over to FDS

except AssertionError as e:
result = e
result = Validator(
execution_engine=PandasExecutionEngine(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting that this works. it's preferable to accessing a private attr on the datasource, but makes me wonder what our ideal pattern would be.

Comment on lines +715 to +717
# TODO: Convert this to actually mock an exception being thrown
# graph = ValidationGraph(execution_engine=execution_engine)
# graph.build_metric_dependency_graph = mock_error # type: ignore[method-assign]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this going to be follow up work? is it scoped somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@joshua-stauffer joshua-stauffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! 🦖

@tyler-hoffman tyler-hoffman added this pull request to the merge queue May 3, 2024
Merged via the queue into develop with commit 93ee4a2 May 3, 2024
70 checks passed
@tyler-hoffman tyler-hoffman deleted the m/v1-25/remove-block-style-datasource branch May 3, 2024 20:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants