Rewrite existing file issue #388
Comments
which operating system are you using? |
Windows 10 |
ok, i think the problem is that windows doesn't let you write to open files. |
@erikbern hi, can you clarify your point? I have the same issue while loading and building a model. I always see this OSError. At first I thought it was something specific to Windows paths, but I tried all the options: absolute/relative paths, slashes/backslashes, etc. No luck. Env: Win10, python 3.7, miniconda. Also, note that the file's size is 7 GB, and the same code works perfectly on macOS. |
interesting. i've heard people report various issues with file sizes above a few gigabytes. let me see if i can add a test case this weekend and fix |
@erikbern just noticed the following line in logs, when such error occurs: |
interesting, will take a look |
@sskorol do you see any error message? there should have been an exception raised back to Python if lseek fails. See https://github.com/spotify/annoy/blob/master/src/annoylib.h#L1049 |
@erikbern just the same stack as above:
And the path is 100% valid as I'm reaching annoy loader:
|
i think the "invalid argument" probably refers to the offset argument to lseek. just double checking, but you said this is on windows, right? there's probably some 32/64 bit issue with large files |
@erikbern yes, Win 10 x64. |
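For readers following along, here is a minimal sketch of the suspected 32/64-bit problem. It assumes the seek offset ends up in a signed 32-bit integer (on 64-bit Windows a C `long` is still 32 bits wide); nothing in the thread up to this point confirms that this is the exact cause. The helper names are hypothetical, and the 7 GiB figure only mirrors the file size reported above.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest offset a signed 32-bit integer can hold


def as_int32(offset: int) -> int:
    """Truncate an offset the way a signed 32-bit C integer would (hypothetical helper)."""
    # ctypes integer types do no overflow checking, so the value simply wraps.
    return ctypes.c_int32(offset).value


def fits_in_int32(offset: int) -> bool:
    """True if the offset survives a round trip through a signed 32-bit integer."""
    return -2**31 <= offset <= INT32_MAX


small = 1 * 1024**3  # 1 GiB: fits
big = 7 * 1024**3    # 7 GiB: assumed size, matching the report above

print(as_int32(small), fits_in_int32(small))
print(as_int32(big), fits_in_int32(big))
```

If the wrapped value comes out negative, a seek call receiving it would plausibly reject it as an invalid argument, which would match the error text discussed here.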
@erikbern hi, did you have a chance to take a look at it? |
haven't had time to look at it. feel free to look at it yourself. should be easy to write a failing unit test for files > 2GB |
@erikbern a simple test that fails:
def should_load_big_file_after_save(self):
    f = 250
    t = 10
    m = AnnoyIndex(f, 'angular')
    m.verbose(True)
    for i in range(2000000):
        v = [random.gauss(0, 1) for z in range(f)]
        m.add_item(i, v)
    m.build(t)
    m.save('test_big.annoy')
    self.assertEquals(m.get_n_trees(), t)
Just a note: a file itself is successfully saved (~2Gb+). The problem appears on autoloading after saving: lseek fails with the same error. Is there any instruction on how to build a C++ code in |
Thanks, that confirms what I suspected. I'll make it into a failing unit test for now. Will be interesting to see on what platforms it fails. |
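A rough back-of-the-envelope check of why the test above lands near the 2 GB mark. This is only a sketch: it assumes each vector component is stored as a 4-byte float and ignores per-node bookkeeping and the tree nodes added by `build`, so the real file should be somewhat larger than this lower bound. `raw_vector_bytes` is a hypothetical helper, not part of annoy.

```python
INT32_MAX = 2**31 - 1  # largest offset a signed 32-bit integer can hold


def raw_vector_bytes(n_items: int, f: int, bytes_per_float: int = 4) -> int:
    """Lower bound on index size: raw vector data only (assumes 4-byte floats)."""
    return n_items * f * bytes_per_float


# Same parameters as the failing test: 2,000,000 items with 250 dimensions.
raw = raw_vector_bytes(2_000_000, 250)

print(raw)              # bytes of raw vector data
print(INT32_MAX - raw)  # headroom left below the signed 32-bit limit
```

With so little headroom left, any extra per-item or per-tree overhead would push the file past the signed 32-bit limit, which is consistent with the "~2Gb+" size reported in the test comment.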
Should be fixed now! |
This should be fixed in latest master:
#442
I'll push a new version to PyPI in a couple of weeks. It would be great if
you can test the latest version in master and see if it fixes your issue
…On Tue, Dec 17, 2019 at 3:39 PM Sergey Korol ***@***.***> wrote:
@erikbern <https://github.com/erikbern> a simple test that fails:
def should_load_big_file_after_save(self):
    f = 250
    t = 10
    m = AnnoyIndex(f, 'angular')
    m.verbose(True)
    for i in range(2000000):
        v = [random.gauss(0, 1) for z in range(f)]
        m.add_item(i, v)
    m.build(t)
    m.save('test_big.annoy')
    self.assertEquals(m.get_n_trees(), t)
Just a note: a file itself is successfully saved (~2Gb+).
The problem appears on autoloading after saving: *lseek* fails with the same error.
I found this thread <https://stackoverflow.com/questions/20051948/equivalent-of-lseek-on-linux-in-windows-api/20052055#20052055> on SO. I'm not a C++ Guru to confirm or reject it. However, I can try to take a look at it.
Is there any instruction on how to build a C++ code in /src?
|
@erikbern thanks, will check it tomorrow |
@erikbern checked it... |
np. was a fairly easy fix. i'll push a new version to pypi before the end of the year |
fyi this is on pypi now |
Hello! I'm using annoy 1.15.2 on python 3.7.3. Here is the issue:
Could you tell why I get the error?