Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding files with missing tree to bad files #730

Merged
merged 7 commits into from
Oct 31, 2022

Conversation

MoAly98
Copy link
Contributor

@MoAly98 MoAly98 commented Oct 7, 2022

No description provided.

@lgray
Copy link
Collaborator

lgray commented Oct 7, 2022

@MoAly98 you need to fix the flake8 and black errors in this PR. :-)

@lgray
Copy link
Collaborator

lgray commented Oct 7, 2022

coffea/processor/executor.py:77:1: E302 expected 2 blank lines, found 1
coffea/processor/executor.py:1406:1: W293 blank line contains whitespace
coffea/processor/executor.py:1832:1: W293 blank line contains whitespace
coffea/processor/executor.py:1836:9: E[11](https://github.com/CoffeaTeam/coffea/actions/runs/3203441136/jobs/5240454273#step:4:12)5 expected an indented block (comment)
coffea/processor/executor.py:1836:9: E265 block comment should start with '# '
coffea/processor/executor.py:1842:1: W293 blank line contains whitespace

@MoAly98
Copy link
Contributor Author

MoAly98 commented Oct 7, 2022

hopefully this fixes them. Should I rebase as well since you released new coffea?

@lgray
Copy link
Collaborator

lgray commented Oct 7, 2022

You've got a few problems to fix. You can test them on your local computer using pytest:
https://coffeateam.github.io/coffea/installation.html#for-developers

@MoAly98
Copy link
Contributor Author

MoAly98 commented Oct 10, 2022

Weird to fail. I ran into the serviceX issue locally but I don't think I introduced it. I also had not ran into the others locally.

coffea/processor/executor.py Outdated Show resolved Hide resolved
@MoAly98 MoAly98 force-pushed the maly-skipbadfiles-missing-trees branch from eef9ff5 to 6762a25 Compare October 27, 2022 19:11
@lgray
Copy link
Collaborator

lgray commented Oct 27, 2022

@MoAly98 the workqueue tests fail in this case - can you please investigate? Thanks!

@MoAly98
Copy link
Contributor Author

MoAly98 commented Oct 27, 2022

@MoAly98 the workqueue tests fail in this case - can you please investigate? Thanks!

@MoAly98 the workqueue tests fail in this case - can you please investigate? Thanks!

urgh yeah i know what's wrong.. forgot we are dealing with generators here. I don't think its so expensive to check with next(chunks, None) is not None. Do you have any problems with that?

@lgray
Copy link
Collaborator

lgray commented Oct 27, 2022

Should be fine so long as it doesn't cause other regressions.

@MoAly98
Copy link
Contributor Author

MoAly98 commented Oct 27, 2022

local tests pass :)

@MoAly98
Copy link
Contributor Author

MoAly98 commented Oct 28, 2022

I preferred a lambda solution, but flake doesn't shut up about it ...

@lgray
Copy link
Collaborator

lgray commented Oct 29, 2022

can you force push this or something to make the CI turn over - it's weird it didn't catch it..

@lgray lgray changed the title Draft:: Adding files with missing tree to bad files Adding files with missing tree to bad files Oct 29, 2022
@lgray lgray merged commit 335963f into CoffeaTeam:master Oct 31, 2022
@lgray
Copy link
Collaborator

lgray commented Oct 31, 2022

@MoAly98 This is still broken for workqueue - please see the latest CI run. Please fix this.

@lgray
Copy link
Collaborator

lgray commented Oct 31, 2022

Perhaps also - @btovar as well for a second set of eyes.

@lgray
Copy link
Collaborator

lgray commented Oct 31, 2022

@MoAly98
Copy link
Contributor Author

MoAly98 commented Oct 31, 2022

I am actually not so sure. What is the work queue doing that is not being checked by the test suite? also how do I test the work-queue locally?

if wrapped_out is None:
def check_chunks(x):
if meta:
return next(x, None) is not None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be a problem. The first element of the generator will be lost?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it is better to simply check the len() == 0 of chunks_to_count above (line 1811). It leaves chunks alone if it comes from a generator, and it has the same contents as chunks if not. Further, you may not need to define the function check_chunks anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes thats a much better idea. It also separates the lack of chunks from checking bad processors. Thanks! I implemented and pytest passes at least.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the first element is lost (which it should, you're right) the tests should fail ... makes me think the test is not covering something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/MoAly98/coffea/blob/391378089f3f30337205a293801040b09e606457/coffea/processor/executor.py#L1802

This is what's breaking. You can still get chunks_to_count/chunks as a generator. I guess a check_chunks functio there could make sense...

@btovar
Copy link
Contributor

btovar commented Oct 31, 2022

@lgray Did you intend this one to be merged?

@btovar
Copy link
Contributor

btovar commented Oct 31, 2022

I am actually not so sure. What is the work queue doing that is not being checked by the test suite? also how do I test the work-queue locally?

You should be able to cd tests and python wq.py.

@MoAly98
Copy link
Contributor Author

MoAly98 commented Oct 31, 2022

I am actually not so sure. What is the work queue doing that is not being checked by the test suite? also how do I test the work-queue locally?

You should be able to cd tests and python wq.py.

it is complaining that work_queue does not exist. I installed workqueue work-queue in the env but it still complained ...

@lgray
Copy link
Collaborator

lgray commented Oct 31, 2022

For some reason the CI broke in the PR and I couldn't get it to run again. If we can't fix this by thursday I'll revert the merge.

@btovar
Copy link
Contributor

btovar commented Oct 31, 2022

I am actually not so sure. What is the work queue doing that is not being checked by the test suite? also how do I test the work-queue locally?

You should be able to cd tests and python wq.py.

it is complaining that work_queue does not exist. I installed workqueue work-queue in the env but it still complained ...

How did you install work queue? Usually it comes from a conda install, like conda install -c conda-forge ndcctools. This assumes you are running inside a conda environment already, as in:

# Create a new environment
cd coffea
conda create --yes --name coffea-env -c conda-forge python  ndcctools  conda-pack
conda activate coffea-env
# install coffea in edit mode, so that changes get immediately available:
python -m pip install --ignore-installed .
cd tests
python wq.py

@MoAly98
Copy link
Contributor Author

MoAly98 commented Nov 1, 2022

@lgray @btovar I got rid of the chunks length check all together -- I think it was actually redundant and something I had introduced to avoid a test breaking but the problem was originally in the test itself so the line was not needed at all.

I have now updated the exectutor module, and I also changed the tests around so that they check for cases where a tree is present in one file but not the other, as well as when the tree is missing from both files. The tests all pass locally, and linting also passes. I also ran the WQ and it seems to work fine.

On that note though, I wonder why the WorkQueue test is not as extensive as the rest of the executors? Does running a coverage really show that all WorkQueue branchings are covered?

@btovar
Copy link
Contributor

btovar commented Nov 1, 2022

@MoAly98 I think it is because the wq executor needs to be setup with conda, so we have not got around to add it to the rest of the python tests.

@lgray
Copy link
Collaborator

lgray commented Nov 1, 2022

@MoAly98 can you submit a new pr with the changes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants