datapythonista
Member

Checking how long each test file takes to run in the CI. In #26949 I'd like to run the tests with the constraint that each file runs entirely in a single worker (so we can run single tests in parallel).

This means that files that take a long time won't parallelize well and should be split for this to work. So I want to know how long each file takes.
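A minimal sketch of how per-file timings could be collected locally, assuming each file is run in its own pytest process. The helper name `time_test_files` and its `command` parameter are hypothetical and not part of this PR, and the CI job may measure things differently.

```python
import subprocess
import sys
import time
from pathlib import Path


def time_test_files(test_root, command=None):
    """Run each test_*.py file under test_root separately and time it.

    Returns a list of (path, seconds) tuples, slowest first.
    `command` is the argv prefix used to run one file; the default
    (an assumption) invokes pytest quietly through the current interpreter.
    """
    if command is None:
        command = [sys.executable, "-m", "pytest", "-q"]
    timings = []
    for path in sorted(Path(test_root).rglob("test_*.py")):
        start = time.perf_counter()
        # Test failures are ignored here; only wall-clock time is recorded.
        subprocess.run(
            [*command, str(path)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        timings.append((str(path), time.perf_counter() - start))
    # Slowest files first, since those are the candidates for splitting.
    return sorted(timings, key=lambda item: item[1], reverse=True)
```

For example, `time_test_files("pandas/tests/io")` would time every test file under that directory. Note that starting one process per file adds interpreter and import overhead to each measurement.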

@datapythonista added the Testing (pandas testing functions or related to the test suite) and CI (Continuous Integration) labels on Jun 20, 2019
@datapythonista
Member Author

Checking how long every test file takes, to see whether some should be split for better parallelization, I see this, which doesn't look right to me:

pandas/tests/io/test_html.py : 21m59.041s
pandas/tests/io/json/test_pandas.py : 8m50.339s
pandas/tests/io/parser/test_common.py : 6m45.860s
pandas/tests/io/excel/test_readers.py : 13m14.864s

I didn't check the files individually. I guess in some cases we want to test loading big files, but 22 minutes to test the HTML I/O sounds excessive.
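To illustrate why one long file limits file-level parallelization, here is a rough sketch that converts the four timings above to seconds and estimates the wall time for a given number of workers. The greedy longest-first assignment in `makespan` is an assumption made for illustration, not how any particular test runner schedules files, and the estimate covers only these four files.

```python
import heapq
import re

# Timings copied from the CI output quoted above.
CI_TIMINGS = {
    "pandas/tests/io/test_html.py": "21m59.041s",
    "pandas/tests/io/json/test_pandas.py": "8m50.339s",
    "pandas/tests/io/parser/test_common.py": "6m45.860s",
    "pandas/tests/io/excel/test_readers.py": "13m14.864s",
}


def to_seconds(duration):
    """Convert a `time`-style string such as '21m59.041s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", duration)
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def makespan(durations, n_workers):
    """Estimate wall time when whole files are assigned, longest first,
    to whichever worker currently has the least work."""
    loads = [0.0] * n_workers
    heapq.heapify(loads)
    for seconds in sorted(durations, reverse=True):
        # Replace the least-loaded worker's total with its new total.
        heapq.heapreplace(loads, loads[0] + seconds)
    return max(loads)


seconds_per_file = [to_seconds(value) for value in CI_TIMINGS.values()]
```

With four or more workers, `makespan(seconds_per_file, n)` cannot drop below the `test_html.py` time of about 1319 seconds, because a file is never split across workers.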

@pandas-dev/pandas-core is this expected?

@jorisvandenbossche
Member

Did you try running that single file locally?

@datapythonista
Member Author

No, I wanted to check first whether this is expected and I'm just missing something. If this doesn't seem right, I'll then try to figure out what's going on.

@jorisvandenbossche
Member

The second one takes 4.6 seconds locally (not almost 9 min). 2 out of 97 tests are skipped (encoding related), one of which is an s3 test.

@jbrockmendel
Member

pandas/tests/io/test_html.py

I'm seeing the two `test_invalid_url` tests (not marked as slow) at about 150s apiece.
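One way to get per-test numbers like these (not necessarily how they were measured here) is pytest's `--durations` option. A minimal sketch, where the helper names `durations_command` and `report_slowest` are hypothetical:

```python
import subprocess
import sys


def durations_command(test_file, n_slowest=10):
    """Build a pytest invocation that reports the n slowest test phases."""
    return [sys.executable, "-m", "pytest", test_file, f"--durations={n_slowest}"]


def report_slowest(test_file, n_slowest=10):
    """Run pytest on one file and return its text output, which includes
    the slowest-durations section when the run completes."""
    result = subprocess.run(
        durations_command(test_file, n_slowest),
        capture_output=True,
        text=True,
        check=False,  # a failing test should not hide the timing report
    )
    return result.stdout
```

Calling `print(report_slowest("pandas/tests/io/test_html.py", 5))` from the repository root would list the five slowest setup, call, and teardown phases in that file.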

@jbrockmendel
Member

@datapythonista closable?

@datapythonista
Member Author

Yep, let's close this for now. We may need to reopen it later to investigate the timing problem, but I don't have time now.

