Disk Error When Attempting to Run Tests #286
Comments
@pgarz Thanks for bringing this up. We're aware of this problem and will ship a new release with a fix in the next 12 hours; we'll ping you when it's done! |
Hey @pgarz, as promised, it's fixed in the latest release 0.20.19. By the way, come join us on Discord; we're talking there and today someone brought up this exact issue :) https://discord.com/invite/a3K9c8GRGt |
wonderful, thanks! |
I have the same problem when running both |
@julfr AttributeError: 'NoneType' object has no attribute 'add_llm_test_case' ? |
I am sorry, the error is: AttributeError: 'NoneType' object has no attribute 'test_cases' |
Can you provide some code for me to reproduce?
|
Sure, I am just trying to reproduce the example in my Jupyter notebook:
Here there's no problem and I see the output:
But when I try
The same with
Out:
|
+1, I'm also facing the same error. As a workaround I've made this snippet https://gist.github.com/Peilun-Li/a0f26847812e177383a3dc7f17b3d84b which runs bulk evaluation with full and customizable parallelism, and does not rely on file sync, given we support async mode now. It seems to be working in my preliminary tests. Feel free to use and adapt. |
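The gist itself isn't reproduced in this thread, but the idea it describes (bulk evaluation with customizable parallelism, keeping results in memory instead of syncing through a file) can be sketched with plain `asyncio`. This is a hypothetical stand-in, not the gist's actual code; `evaluate_case` is a placeholder for one async metric/LLM call.

```python
import asyncio

async def evaluate_case(case_id, semaphore):
    # Placeholder for one async metric/LLM call.
    async with semaphore:
        await asyncio.sleep(0)  # simulate I/O-bound work
        return {"case": case_id, "score": 1.0}

async def bulk_evaluate(case_ids, max_concurrency=8):
    # A semaphore bounds parallelism; results stay in memory, so no
    # temp-file synchronization between workers is needed.
    sem = asyncio.Semaphore(max_concurrency)
    return await asyncio.gather(*(evaluate_case(c, sem) for c in case_ids))

results = asyncio.run(bulk_evaluate(range(5)))
```

`asyncio.gather` preserves input order, so `results[i]` corresponds to case `i` regardless of completion order.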
Hey @julfr, I missed your message. May I ask, does this only happen when running |
The error on my end can be surfaced using the following simple snippet
Running it ends up with the error. Digging a bit deeper, it does feel like a file-system-related issue: when running the above code on mac/linux with a normal local disk, there's no issue running the snippet. The error only seems to happen when an NFS disk mount (in our case, AWS EFS) is used. So, looking into the code which throws the error:
So the potential fix (at least for the one on my end) could be to use a different lock flag when attempting to only read the file, at https://github.com/confident-ai/deepeval/blob/v0.21.00/deepeval/test_run/test_run.py#L187-L189. Or we could update the mode used there to be read and write (w+)? A deeper question, though, is whether this file sync pattern is sustainable, and whether there are cleaner ways to maintain everything in memory and only write to disk as a final step if needed. |
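The shared-vs-exclusive lock distinction behind the proposed fix can be illustrated with the stdlib `fcntl` module, which portalocker wraps on POSIX systems (a hedged stand-in; the file path and payload here are hypothetical, not deepeval's actual state file):

```python
import fcntl
import json
import os
import tempfile

# Hypothetical state file standing in for deepeval's temp test-run file.
path = os.path.join(tempfile.mkdtemp(), "temp_test_run.json")

# Writers take an exclusive lock: no other lock may coexist with it.
with open(path, "w") as f:
    fcntl.flock(f, fcntl.LOCK_EX)
    json.dump({"test_cases": []}, f)
    fcntl.flock(f, fcntl.LOCK_UN)

# Readers take a shared lock: any number of readers can hold it at
# once, but it still excludes a concurrent exclusive writer.
with open(path, "r") as f:
    fcntl.flock(f, fcntl.LOCK_SH)
    state = json.load(f)
    fcntl.flock(f, fcntl.LOCK_UN)
```

Note that `flock` semantics over NFS mounts vary by kernel and mount options, which is consistent with the error here only surfacing on EFS.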
Thanks @Peilun-Li for the detailed debugging. I was wondering, does this work for the following:

```python
with open(self.temp_file_name, "r") as file:
    portalocker.lock(file, portalocker.LOCK_SH, timeout=5)
    self.test_run = self.test_run.load(file)
```

Here we are using portalocker but manually managing the opening of files. Since I don't have the environment you're working in, would it be possible for you to try it out and see what happens? Thank you so much! |
(As for the file sync pattern: we're doing it because it is probably the easiest way we can achieve multiprocessing, which does not have shared memory.) |
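The pattern being described, in which worker processes have no shared memory and therefore rendezvous through a locked file on disk, can be sketched as a read-modify-write under an exclusive lock. This is a simplified illustration, not deepeval's implementation; the file name and result shape are hypothetical, and the loop below simulates what would really be calls from separate pytest worker processes:

```python
import fcntl
import json
import os
import tempfile

RESULTS = os.path.join(tempfile.mkdtemp(), "results.json")

def append_result(result):
    # Read-modify-write the shared file under an exclusive lock; in the
    # real setup each call would come from a separate worker process,
    # which shares no memory with its siblings.
    with open(RESULTS, "a+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        f.seek(0)
        raw = f.read()
        data = json.loads(raw) if raw else []
        data.append(result)
        f.seek(0)
        f.truncate()
        json.dump(data, f)
        fcntl.flock(f, fcntl.LOCK_UN)

for i in range(3):  # simulate three workers reporting in turn
    append_result({"case": i, "passed": True})
```

The lock makes each read-modify-write atomic with respect to the other workers, which is exactly the guarantee that degrades on some NFS mounts.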
this gives an error of The following one seems to be working from my end (I have to add a
Yeah, agreed, this file pattern makes sense when the model calls can only be made in a sync pattern (it may also be a side effect of the statefulness of metric objects; we can reserve that discussion for the other thread :) ). Also curious: given async models & metrics are supported right now, could we make a shortcut (as in my bulk evaluation snippet above) if async mode is enabled? |
@Peilun-Li So async mode just runs metrics concurrently for each test case, not for all test cases at once. For all test cases at once we still rely on multiprocessing for now. By the way, I'm going to bring this to Discord, I'm losing track of the issues on GitHub... |
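The distinction drawn here (metrics concurrent within one test case, test cases still sequential unless multiprocessing is used) can be sketched as follows. This is an illustrative mock, not deepeval's API; `run_metric` and the metric names are hypothetical:

```python
import asyncio

async def run_metric(metric, case):
    # Hypothetical async metric call.
    await asyncio.sleep(0)
    return (case, metric, 1.0)

async def evaluate(cases, metrics):
    results = []
    for case in cases:  # test cases are still processed one at a time...
        # ...but all metrics for a single test case run concurrently.
        results.extend(await asyncio.gather(
            *(run_metric(m, case) for m in metrics)))
    return results

scores = asyncio.run(evaluate(["case_a", "case_b"],
                              ["answer_relevancy", "faithfulness"]))
```

Making the outer loop concurrent as well (gathering over cases, as the gist does) is what removes the need for multiprocessing, and with it the shared temp file.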
Experiencing the same issue on an HPC with an NFS disk. |
Hey @repetitioestmaterstudiorum, just to confirm, are you calling |
Hey @penguine-ip, no, it's happening only with the |
@repetitioestmaterstudiorum OK, I see. This is weird because they use the same methods under the hood (both write to disk). Do you have a locally forked version of deepeval, or are you using the version from PyPI (the one from pip install/poetry install)? |
I am using the unchanged version 0.21.21 on the HPC. |
@repetitioestmaterstudiorum I see. Well, this is definitely an anti-pattern. A simple experiment you could try: don't run assert_test in a for loop; do you get the same disk error? |
Probably (regarding the anti-pattern). There's no benefit for me in using pytest & |
I take that back. Today I ran the same script using |
Hey @Peilun-Li @repetitioestmaterstudiorum, can you please try the evaluate function again on the latest release (v0.21.25)? I added the shared lock to read operations as @Peilun-Li described. It's not causing any additional errors so far, but I'd like to see if it fixes the problems on your systems, thanks! |
This worked for me, thank you for the update @penguine-ip ! |
No problem @repetitioestmaterstudiorum , all thanks to @Peilun-Li for leading the solution to this 🙏🏻 |
Works from my end now as well! Thanks for fixing it! |
@Peilun-Li Not at all, all thanks to you! Stateless metric UX will be done early next month I think. |
Hey @Peilun-Li and @repetitioestmaterstudiorum, I fixed some bugs with race conditions and abstracted away the logic here: https://github.com/confident-ai/deepeval/blob/main/deepeval/test_run/test_run.py#L318 I don't think it should cause any new issues with your file systems, but just in case, please let me know if you run into any problems on the new update. The release with this new piece of code is v0.21.30, thanks! |
Hi @penguine-ip, it works for me! |
However, I'm noticing now that I'm getting this occasional warning:
|
@repetitioestmaterstudiorum I tried playing around with it, and in the previous example @Peilun-Li put out (using an NB lock), it hides the error. However, since it is now in "r+" mode, we can't use an NB lock. So we have to either 1) remove the timeout, or 2) ignore it. Does it print once, or does it spam your terminal console in a deepeval test run? @Peilun-Li Do you have any suggestions for this fix? It is happening here: https://github.com/confident-ai/deepeval/blob/main/deepeval/test_run/test_run.py#L325 |
It only prints once, so it's not an annoyance. I was thinking: when executing scripts that read and write files in an HPC environment, it's good practice anyway to run them in directories on physical (not NFS) drives. E.g. when I train models that rely on data loaders (loading images, etc.), I always move that data to the local, physical drive and make sure my script reads and writes to this faster drive. What's special in this case is that I don't actually have a data-loader setup or any setup where I intentionally read and write files locally; it's just a library that holds state. |
Would it be possible to set the lock to be
Alternatively, yeah, we can omit the timeout here given it's warned to be ineffective anyway. Plus, based on the logic of portalocker ( https://github.com/wolph/portalocker/blob/v2.8.2/portalocker/utils.py#L215-L221 ), it will use the same 5s timeout even if we do not provide one explicitly, and we could bypass the warning there. |
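The fail-fast behavior of a non-blocking lock (which is what portalocker's LOCK_NB flag exposes, versus waiting out a timeout) can be demonstrated with stdlib `fcntl` on POSIX. This is a minimal sketch under the assumption that two separate `open()` calls create independent lock holders, which is how `flock` behaves on Linux:

```python
import fcntl
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "locked.txt")
open(path, "w").close()

holder = open(path, "r+")     # first open file description
contender = open(path, "r+")  # a second, independent one

fcntl.flock(holder, fcntl.LOCK_EX)  # a writer already holds the lock
try:
    # LOCK_NB makes the attempt fail immediately with BlockingIOError
    # instead of blocking until the writer releases (or a timeout fires).
    fcntl.flock(contender, fcntl.LOCK_SH | fcntl.LOCK_NB)
    acquired = True
except BlockingIOError:
    acquired = False

fcntl.flock(holder, fcntl.LOCK_UN)
holder.close()
contender.close()
```

With the non-blocking attempt, the caller decides what to do on contention (retry, skip, or report) rather than relying on a timeout whose effectiveness the library itself warns about.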
Describe the bug
I get a disk error when trying to run a test. It might be a bug in how a temp file is being cached by the DeepEval library.
To Reproduce
My stack trace is as follows