Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST: Get tests to run and fix them to pass #50636

Merged
merged 5 commits into from Jan 21, 2023
Merged

TST: Get tests to run and fix them to pass #50636

merged 5 commits into from Jan 21, 2023

Conversation

phershbe
Copy link
Contributor

@phershbe phershbe commented Jan 9, 2023

NOTE: test_metadata_propagation is still not fixed yet in this draft pull request

Changed the class name from Generic to TestGeneric in order to get the test to run and then fixed five groups of tests (test_rename, test_get_numeric_data, test_frame_or_series_compound_dtypes, test_metadata_propagation, test_api_compat) in order to make sure that all of the tests pass.

@phershbe
Copy link
Contributor Author

phershbe commented Jan 9, 2023

@MarcoGorelli Thank you for your patience, I'm learning a lot. The work is explained in the description above. There are five groups of tests that weren't passing and four of them are fixed.

test_metadata_propagation on Line 174 still isn't fixed. I understand the basic idea, that we are assigning name variables to sets of data and then passing those sets of data through operations and making sure that the metadata is still the same. I know that assert_metadata_equivalent always needs to compare two sets of data, and sometimes here the second set is not added yet, but I can't figure out yet what to add. I'm sure that I can get it with some more time, but I've taken a long time already, so feel free to do it if you want.

Also, the reset_index(drop=True) on Line 96 works to get the test using a Series in test_get_numeric_data to pass, but it seems like there might be some cleaner way.

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @phershbe !

For the metadata test, here's how I'd approach it:

  1. tm.assert_metadata_equivalent(result) doesn't run, so let's check which PR last touched it:
$ git log -G 'tm.assert_metadata_equivalent\(result\)'
commit 3570efd439ae49a7800f63a6247f089fa462405a
Author: Matthew Roeschke <>
Date:   Tue Dec 28 12:32:31 2021 -0800

    TST: More pytest idioms in tests/generic (#45086)
  1. let's look at TST: More pytest idioms in tests/generic #45086 - we see that
    def check_metadata(self, x, y=None):
        for m in x._metadata:
            v = getattr(x, m, None)
            if y is None:
                assert v is None
            else:
                assert v == getattr(y, m, None)

became

def assert_metadata_equivalent(left, right):
    """
    Check that ._metadata attributes are equivalent.
    """
    for attr in left._metadata:
        val = getattr(left, attr, None)
        if right is None:
            assert val is None
        else:
            assert val == getattr(right, attr, None)
  1. Looks like there was a little difference between the functions, but it wasn't picked up because the tests weren't running - can you spot it?

  2. finally, it would be good to add type annotations to this function, whilst we're changing it. I think just DataFrame | Series should work

@phershbe
Copy link
Contributor Author

@MarcoGorelli Whoa, you're the best, thank you for the guidance. The advice to find which PR last touched the code and then investigate it should be very helpful for me in the future, thank you. I updated the code and the tests all pass now!

I didn't add anything to the doc/source/whatsnew/vX.X.X.rst because I thought that maybe this is too small of an issue, but I can if you think it's a good idea. Let me know.

Also, after my initial commit, I got some e-mails saying that various checks in CI were failing, is that normal?

Of course, feel free to let me know if there is anything else that needs to be changed.

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

yeah you can see the failing check here: https://github.com/pandas-dev/pandas/actions/runs/3880306936/jobs/6618255248

can you also remove the exclude from

exclude: |
(?x)
^pandas/tests/generic/test_generic.py # GH50380

please?

pandas/_testing/asserters.py Outdated Show resolved Hide resolved
@phershbe
Copy link
Contributor Author

@MarcoGorelli Great, thank you for the reminder about reading about the failed check. On a similar note, there is already a failed check here, but it's something about Ubuntu that is failing on a lot of other pull requests from other people I noticed.

I'm trying to read about the meaning of the | and (?x) in exclude in YAML in order to better understand what I'm doing, but I went ahead and deleted all three lines because it seems like that's what you were asking for.

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @phershbe - can you mark as ready for review please? should be good to go

for exclude, check the pre-commit docs

@MarcoGorelli MarcoGorelli added this to the 2.0 milestone Jan 10, 2023
@MarcoGorelli MarcoGorelli added the Testing pandas testing functions or related to the test suite label Jan 10, 2023
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, blocking as there's something I hadn't picked up on, which we should check

Comment on lines 91 to 96
result = o._get_bool_data()
expected = construct(n, value="empty", **kwargs)
expected = construct(frame_or_series, n, value="empty", **kwargs)
if isinstance(o, DataFrame):
# preserve columns dtype
expected.columns = o.columns[:0]
tm.assert_equal(result, expected)
tm.assert_equal(result.reset_index(drop=True), expected)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@topper-123 is this right?

In [1]: ser = Series([1,2,3])

In [2]: ser.index
Out[2]: RangeIndex(start=0, stop=3, step=1)

In [3]: ser._get_bool_data()
Out[3]: Series([], dtype: int64)

In [4]: ser._get_bool_data().index
Out[4]: Index([], dtype='object')

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phershbe OK for now could you please just add a comment with

# https://github.com/pandas-dev/pandas/issues/50862

above this line? then we can get this in and address that separately if needed

@phershbe
Copy link
Contributor Author

phershbe commented Jan 11, 2023

thanks @phershbe - can you mark as ready for review please? should be good to go

for exclude, check the pre-commit docs

@MarcoGorelli I'll wait for the response on the part that's blocking in the comment above before marking as ready because maybe that will inform another reviewer that it's ready I guess, but I'll check back in a few hours at the end of the night and again in the morning and mark it ready when you let me know so that it can get finished since it has taken me so long.

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just left a comment

If you fetch and merge upstream/main and mark as ready for review then we can get this in

Comment on lines 91 to 96
result = o._get_bool_data()
expected = construct(n, value="empty", **kwargs)
expected = construct(frame_or_series, n, value="empty", **kwargs)
if isinstance(o, DataFrame):
# preserve columns dtype
expected.columns = o.columns[:0]
tm.assert_equal(result, expected)
tm.assert_equal(result.reset_index(drop=True), expected)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phershbe OK for now could you please just add a comment with

# https://github.com/pandas-dev/pandas/issues/50862

above this line? then we can get this in and address that separately if needed

@phershbe
Copy link
Contributor Author

@MarcoGorelli Thank you, I'm on it, doing it now

@phershbe phershbe marked this pull request as ready for review January 19, 2023 17:50
@phershbe
Copy link
Contributor Author

@MarcoGorelli Okay, I think everything should be ready now. I was a little bit confused because your message here in the conversation was a comment about a comment in the files changed tab about adding a comment to the code ... it's clear enough though, it's my inexperience. I added the comment above Line 91, I didn't know if it would be more appropriate there or above Line 97, so feel free to move it if necessary.

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @phershbe !

@MarcoGorelli MarcoGorelli merged commit 56c1b20 into pandas-dev:main Jan 21, 2023
pooja-subramaniam pushed a commit to pooja-subramaniam/pandas that referenced this pull request Jan 25, 2023
* updated tests

* updated assert_metadata_equivalent

* updated assert_metadata_equivalent and deleted exclude in YAML check-test-naming

* added comment

* move comment

Co-authored-by: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TST 56 tests in pandas/tests/test_generic.py aren't being tested
2 participants