Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TEST-#6994: Update tests in test_series.py #6995

Merged
merged 4 commits into from
Mar 5, 2024

Conversation

anmyachev
Copy link
Collaborator

@anmyachev anmyachev commented Mar 4, 2024

What do these changes do?

Obsolete parameter combinations were found in #6954.

  • first commit message and PR title follow format outlined here

    NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.

  • passes flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
  • passes black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
  • signed commit with git commit -s
  • Resolves Update tests in test_series.py #6994
  • tests added and passing
  • module layout described at docs/development/architecture.rst is up-to-date

)


@pytest.mark.parametrize("data", test_data_values, ids=test_data_keys)
@pytest.mark.parametrize("func", agg_func_values, ids=agg_func_keys)
def test_aggregate_numeric(request, data, func):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In individual tests that have the word numeric in their names, there is no meaning, since they test a subset of parameters that are already fully tested.

eval_general(
modin_series,
pandas_series,
lambda series: user_warning_checker(
series, fn=lambda series: series.aggregate("cumproduct")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cumproduct is deprecated in favour of cumprod.

@pytest.mark.parametrize(
"skipna", bool_arg_values, ids=arg_keys("skipna", bool_arg_keys)
)
@pytest.mark.parametrize("skipna", [False, True])
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done mainly to avoid testing the None value, since it no longer makes sense (pandas has changed the default value), or pandas throws an exception.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we not be testing this case? Do we throw the same exception?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These values are no longer within the range of acceptable values, and the test coverage has not changed, therefore we do not have specific code for such a case and we can quite freely remove this test case.

I also believe that tests for receiving the same errors for situations where values of an invalid type are passed are not the first priority. So for now I just deleted it.

@pytest.mark.parametrize("axis", [None, 0, 1])
@pytest.mark.parametrize("level", [None, -1, 0, 1])
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This argument is no longer used in functions.

@@ -1842,7 +1724,9 @@ def test_dt(timezone):
def dt_with_empty_partition(lib):
# For context, see https://github.com/modin-project/modin/issues/5112
df = (
pd.concat([pd.DataFrame([None]), pd.DataFrame([pd.TimeDelta(1)])], axis=1)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TimeDelta no longer exists.

@@ -3570,8 +3421,12 @@ def test_tolist(data):


@pytest.mark.parametrize("data", test_data_values, ids=test_data_keys)
@pytest.mark.parametrize("func", agg_func_values, ids=agg_func_keys)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most functions are not classified as transformational; there is little point in testing exceptions in most cases here.

@@ -4076,7 +3929,7 @@ def test_str_removesuffix(data):


@pytest.mark.parametrize("data", test_string_data_values, ids=test_string_data_keys)
@pytest.mark.parametrize("width", int_arg_values, ids=int_arg_keys)
@pytest.mark.parametrize("width", [-1, 0, 5])
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are an excessive number of parameterization options; we can reduce their number without changing the coverage.

@@ -4143,7 +3996,7 @@ def test_str_wrap(data, width):
@pytest.mark.parametrize("data", test_string_data_values, ids=test_string_data_keys)
@pytest.mark.parametrize("start", int_arg_values, ids=int_arg_keys)
@pytest.mark.parametrize("stop", int_arg_values, ids=int_arg_keys)
@pytest.mark.parametrize("step", int_arg_values, ids=int_arg_keys)
@pytest.mark.parametrize("step", [-2, 1, 3])
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are an excessive number of parameterization options; we can reduce their number without changing the coverage.

@@ -4358,8 +4211,11 @@ def test_str_rfind(data, sub, start, end):

@pytest.mark.parametrize("data", test_string_data_values, ids=test_string_data_keys)
@pytest.mark.parametrize("sub", string_sep_values, ids=string_sep_keys)
@pytest.mark.parametrize("start", int_arg_values, ids=int_arg_keys)
@pytest.mark.parametrize("end", int_arg_values, ids=int_arg_keys)
@pytest.mark.parametrize(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are an excessive number of parameterization options; we can reduce their number without changing the coverage.

@@ -4430,14 +4289,6 @@ def test_str_translate(data, pat):
modin_series, pandas_series, lambda series: series.str.translate(table)
)

# Test delete chars
deletechars = "|"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This argument no longer exists.

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>
@@ -3384,22 +3244,16 @@ def test_subtract(data):
test_data_values + test_data_small_values,
ids=test_data_keys + test_data_small_keys,
)
@pytest.mark.parametrize("axis", axis_values, ids=axis_keys)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't make sense for the series. I can write a separate test just for this case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test added.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I think we don't even need a test for this😄

Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>
if func in ("all", "kurt"):
func_kwargs["axis"] = axis
elif not axis:
# FIXME: In these cases, Modin does not throw exceptions like pandas does
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create an issue for this. I am also wondering how it has been passing before?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#7000.

I am also wondering how it has been passing before?

level parameter was used there, and pandas and modin failed with the error that this parameter did not exist. That is, the irrelevant exception was hiding the problem.

modin/pandas/test/test_series.py Outdated Show resolved Hide resolved
Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>
Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>
skipna=skipna,
numeric_only=numeric_only,
min_count=min_count,
)


@pytest.mark.parametrize("operation", ["sum", "shift"])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe remove? I don't think we need to test axis=1 for series since it doesn't make sense.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave it for now, it seems to me that there is still some inconsistency in the processing of this parameter in pandas. Let's make sure we're aligned here with pandas.

@YarShev YarShev merged commit 21d10ea into modin-project:master Mar 5, 2024
37 checks passed
@anmyachev anmyachev deleted the issue6994 branch March 5, 2024 09:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update tests in test_series.py
2 participants