[SPARK-40809][SPARK-40780][FOLLOW-UP] Improve filter and alias testing coverage in python client #38278

Closed
wants to merge 5 commits into from

amaliujia
Contributor

@amaliujia amaliujia commented Oct 17, 2022

What changes were proposed in this pull request?

  1. Refactor test_connect_plan_only.py and move common code to base class.
  2. Test generated proto plan for filter and alias.

Why are the changes needed?

Improve python client testing coverage.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests.

@amaliujia
Contributor Author

R: @HyukjinKwon @zhengruifeng

Member

@HyukjinKwon HyukjinKwon left a comment

LGTM otherwise.

        return DataFrame.withPlan(Read(table_name), cls.connect)  # type: ignore

    @classmethod
    def _udf_mock(cls, *args, **kwargs):
Contributor

nit:

Suggested change
-    def _udf_mock(cls, *args, **kwargs):
+    def _udf_mock(cls, *args, **kwargs) -> str:

Contributor Author

Done. I thought mypy checked such cases, but it does not seem to on this line.
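As background for the exchange above: by default mypy treats a function with no annotations at all as dynamically typed and does not report the missing return type. It does so only when an option such as `--disallow-untyped-defs` is enabled. The thread does not say which options Spark's lint script turns on. A minimal sketch of the suggested signature, using a hypothetical `PlanOnlyTestBase` class name (not taken from the PR):

```python
from typing import get_type_hints


class PlanOnlyTestBase:
    """Hypothetical name for the shared test base class."""

    @classmethod
    def _udf_mock(cls, *args, **kwargs) -> str:
        # Returns a fixed name, as the mock in the PR does.
        return "internal_name"


# With the return annotation in place, the function is no longer fully
# untyped, and the annotation is visible at runtime as well.
hints = get_type_hints(PlanOnlyTestBase._udf_mock)
mocked_name = PlanOnlyTestBase._udf_mock("any", key="value")
```

Adding even one annotation is enough for mypy to start type-checking the function body.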

@@ -47,6 +49,9 @@ def __init__(self) -> None:
    def set_hook(self, name: str, hook: Any) -> None:
        self.hooks[name] = hook

    def drop_hook(self, name: str) -> None:
        self.hooks.pop(name)
Contributor

Should we catch the exception raised if self.hooks does not contain name?

Contributor Author

This is only used in our tests, and we have good control over it (e.g. we make sure we only clean up hooks that were registered for the test), so I think it is OK.
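For reference, the trade-off discussed in this thread can be sketched as follows. `HookRegistry` is a hypothetical stand-in for the mock session class in the diff, not the actual PySpark code, and it assumes only that `hooks` is a plain dict. `dict.pop(name)` raises `KeyError` for an unknown name, while `dict.pop(name, None)` silently ignores it.

```python
from typing import Any, Dict


class HookRegistry:
    """Hypothetical stand-in for the mock session in the diff above."""

    def __init__(self) -> None:
        self.hooks: Dict[str, Any] = {}

    def set_hook(self, name: str, hook: Any) -> None:
        self.hooks[name] = hook

    def drop_hook(self, name: str) -> None:
        # Strict variant, as in the PR: raises KeyError if name is not registered.
        self.hooks.pop(name)

    def drop_hook_lenient(self, name: str) -> None:
        # Lenient variant (an assumption, not part of the PR): missing names are ignored.
        self.hooks.pop(name, None)


registry = HookRegistry()
registry.set_hook("register_udf", lambda *args, **kwargs: "internal_name")
registry.drop_hook("register_udf")

try:
    registry.drop_hook("register_udf")  # already removed, so this raises
    strict_raised = False
except KeyError:
    strict_raised = True

registry.drop_hook_lenient("register_udf")  # no error
```

In test-only code the strict variant can be the more useful one, since a `KeyError` points directly at a cleanup that does not match its setup.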

    @classmethod
    def _udf_mock(cls, *args, **kwargs):
        return "internal_name"

    @classmethod
    def setUpClass(cls: Any) -> None:
        cls.connect = MockRemoteSession()
        cls.tbl_name = f"tbl{uuid.uuid4()}".replace("-", "")
Member

We shouldn't actually use uuid here; instead, we should call DROP IF EXISTS before and after running the tests. Since this PR doesn't aim to address that, I am fine with doing it separately.

Contributor Author

Ah, yes.

I will follow up. The Python-side testing framework definitely needs more refinement.
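The cleanup pattern suggested in this thread can be sketched as below. This is not the Spark test code: to stay self-contained it uses Python's built-in `sqlite3` module as a stand-in for a Spark session, and the class name `FixedTableNameTest` and the table name `test_connect_tbl` are hypothetical. It shows a fixed table name made safe by dropping the table, if it exists, both before and after the tests.

```python
import sqlite3
import unittest


class FixedTableNameTest(unittest.TestCase):
    """Hypothetical test using a fixed table name instead of a uuid suffix."""

    tbl_name = "test_connect_tbl"  # hypothetical fixed name

    @classmethod
    def setUpClass(cls) -> None:
        # sqlite3 stands in for the Spark session here (assumption).
        cls.conn = sqlite3.connect(":memory:")
        # Drop leftovers from an earlier, interrupted run before creating the table.
        cls.conn.execute(f"DROP TABLE IF EXISTS {cls.tbl_name}")
        cls.conn.execute(f"CREATE TABLE {cls.tbl_name} (id INTEGER, name TEXT)")

    @classmethod
    def tearDownClass(cls) -> None:
        # Clean up after the tests so later runs start from a known state.
        cls.conn.execute(f"DROP TABLE IF EXISTS {cls.tbl_name}")
        cls.conn.close()

    def test_table_exists(self) -> None:
        rows = self.conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
            (self.tbl_name,),
        ).fetchall()
        self.assertEqual(rows, [(self.tbl_name,)])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(FixedTableNameTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

With a persistent catalog, the drop in `setUpClass` is what guards against a table left behind by a crashed run, which is the case a random uuid suffix otherwise works around.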

@AmplabJenkins

Can one of the admins verify this patch?

@amaliujia
Contributor Author

amaliujia commented Oct 17, 2022

Weird to see this:

Run PYTHON_EXECUTABLE=python3.9 ./dev/lint-python
starting python compilation test...
python compilation succeeded.

starting black test...
black checks passed.

starting flake8 test...
flake8 checks passed.

starting mypy annotations test...
annotations failed mypy checks:
python/pyspark/sql/connect/plan.py:325: error: unused "type: ignore" comment
python/pyspark/sql/connect/plan.py:327: error: unused "type: ignore" comment
python/pyspark/sql/connect/plan.py:331: error: unused "type: ignore" comment
python/pyspark/sql/connect/plan.py:342: error: unused "type: ignore" comment
python/pyspark/sql/connect/plan.py:346: error: unused "type: ignore" comment
python/pyspark/sql/connect/plan.py:348: error: unused "type: ignore" comment
Found 6 errors in 1 file (checked 366 source files)
1

@amaliujia
Contributor Author

I think my local mypy is different...

@HyukjinKwon
Member

Merged to master.

SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
…g coverage in python client

### What changes were proposed in this pull request?

1. Refactor `test_connect_plan_only.py` and move common code to base class.
2. Test generated proto plan for `filter` and `alias`.

### Why are the changes needed?

Improve python client testing coverage.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

Closes apache#38278 from amaliujia/test_where_as_in_python.

Authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>