-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-4002] [py] Improve testing infraestructure #2063
Conversation
you are correct, Verify2 should be modified. there are 2 test scripts since most operations are running at the same time, eating up a lot of memory due to the memory-mapped files. putting them all in one can lead to OOM errors. |
Ok, I will modify it and commit the corrected version. |
I have corrected Verify2(), but there is another case that is not checked, when the resulting datasets have more elements than expected, right now the error will go unnoticed. I also wanted to ask, is there a way to execute only the python tests? Since I want to unify the utilities in a file, but without knowing what is the execution path, I cannot make sure if the module will be imported correctly. |
when we have more data than expected, remove() will be called on an empty list and should throw an exception, no? if you want to execute the python tests you only have to call mvn verify on the flink-python package.
|
Sorry, I said it wrong, it's the opposite. The case that fails in Verify() and Verify2(), is when we have more values in expected than in the resulting DataSet. |
you're right, that falls through. we should add additional checks:
|
I have fixed the last errors in the test functions. But, while trying to refactor the utility code, that now is repeated in both test files, I think I found another bug. The thing is that, in order to be able to have another
Right now, if HDFS is not used, our case with |
test files are never copied to the tmp folder regardless of whether HDFS is used or not. This is not a bug, but intended behaviour. You can easily add the utils.py to the test invokation by modifying the call to the PythonPlanBinder.main in PythonPlanBinderTest. it should be as simple as appending "/utils.py" to the call. |
Yes, I know that how python scripts are executed for the test is different. Let me elaborate: Since running the tests is quite costly in my laptop, I normally test my changes executing them in a local instance of Flink 1.0.3, since this is less taxing. Once I complete the changes, I run |
I can confirm that there is a bug in Should I open another issue, or just fix it here? This is not only a bug related to the testing infrastructure, but the Python API in general. I can provide a small example, where the code fails. |
you can add it to this PR as a separate commit. |
looking at the code right now; i may have figured out why the files aren't copied, but i find it odd that it supposedly works with hdfs. it actually should never copy additional files if no parameters are given. |
Yes, I had the same thought looking at the code, I could not figure out why it worked with HDFS... I will try to fix it tomorrow. |
I think everything should be ready now. I was not able to pinpoint why HDFS worked, I assume |
Is everything ok? Should I look into something else? |
Looks good, I'll merge it later on. |
Thanks! |
The Verify() test function now does not error out when array elements are missing:
I have also documented test functions.
While documenting, two questions arise:
First, Verify2 function has no use as is, performing a
if value in self.expected:
before:Makes this function useless, since it will never raise and exception, if I am not mistaken.
Second, I am not sure why there are two test scripts,
main_test.py
andmain_test2.py
.