Error while looping over complete collection #32

Closed
akki opened this issue Jan 16, 2017 · 6 comments
@akki

akki commented Jan 16, 2017

Hi

I need to loop over all the documents in one of my collections, but I get errors when I try. Please help me figure out how to do this.

These are the two scenarios that I have tried -

>>> coll = acl.database('db_name').collection('coll_name')
>>> for i in coll:
...   print i
...   break
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/arango/collections/base.py", line 51, in __iter__
    raise DocumentGetError(res)
arango.exceptions.DocumentGetError: [HTTP 501][ERR 1470] '/_api/export' is not yet supported in a cluster
>>>

and

>>> for i in coll.find({}):
...   print i
...   break
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/arango/api.py", line 22, in wrapped_method
    return conn.handle_request(request, handler)
  File "/usr/local/lib/python2.7/dist-packages/arango/connection.py", line 165, in handle_request
    return handler(getattr(self, request.method)(**request.kwargs))
  File "/usr/local/lib/python2.7/dist-packages/arango/connection.py", line 233, in put
    auth=(self._username, self._password)
  File "/usr/local/lib/python2.7/dist-packages/arango/http_clients/default.py", line 105, in put
    verify=self._check_cert
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 546, in put
    return self.request('PUT', url, data=data, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 473, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",))
>>>

The second scenario takes some time before throwing the error. The collection has around 26M documents if that matters.

@joowani
Contributor

joowani commented Jan 17, 2017

Hi @akki,

The first issue definitely looks like something I need to address in the next release. For now, you can use for i in db.aql.execute('FOR doc IN coll_name RETURN doc') as a workaround.
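The AQL workaround above could be wrapped in a small generator, sketched here under some assumptions: the helper name `iter_documents` and the default batch size are hypothetical, and the sketch assumes the installed python-arango version accepts `bind_vars` and `batch_size` in `db.aql.execute`. It passes the collection name as an AQL collection bind parameter (`@@coll`) rather than formatting it into the query string.

```python
def iter_documents(db, collection_name, batch_size=1000):
    """Yield every document of a collection through an AQL cursor.

    `db` is expected to be a python-arango database object, e.g. the
    result of client.db('db_name'). The name `iter_documents` and the
    default batch size are illustrative, not part of python-arango.
    """
    cursor = db.aql.execute(
        'FOR doc IN @@coll RETURN doc',
        # '@coll' binds the collection name to the @@coll placeholder.
        bind_vars={'@coll': collection_name},
        # Ask the server to return results in chunks of this size
        # (assumed to be supported by the installed driver version).
        batch_size=batch_size,
    )
    for doc in cursor:
        yield doc


# Usage (requires a reachable ArangoDB server):
#   for doc in iter_documents(db, 'coll_name'):
#       print(doc)
```

Because this goes through the cursor API instead of `/_api/export`, it should avoid the cluster error from the first scenario; whether it helps with the connection error in the second scenario is not established here.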

As for the second issue, what version of ArangoDB are you using? Given the number of documents involved, this could be related to the heavy load issue in #30. Please try upgrading ArangoDB to the latest version, upgrading requests[security] and openssl, and rebuilding your virtualenv if you use one. Let me know if that makes a difference.

@joowani joowani added the bug label Jan 17, 2017
@Slater-Victoroff

Slater-Victoroff commented Jan 31, 2017

Similar problem, but without an explicit error.

I have a collection with just about 1M documents and about 2-3 GB, so very small. When I try to iterate through the collection the code just hangs and my CPU usage shoots up so high that I can't use anything else on my computer. I'm running on a MacBook Pro with 8 GB of memory and an i5 CPU, so it really shouldn't be a problem.

from arango import ArangoClient

client = ArangoClient()
db = client.db('news')
articles = db.collection('Articles')

for article in articles:
    print("Iteration start")  # Never gets to this line

The recommended fix also doesn't make any difference:

for article in db.aql.execute('FOR doc IN Articles RETURN doc'):
    print("Iteration start")  # Still never gets here

@joowani
Contributor

joowani commented Feb 1, 2017

Hi @Slater-Victoroff

Hmm. I will try to reproduce this myself. How did you install ArangoDB (e.g. docker)?
Also, are you able to execute the AQL query I gave you on arangosh?
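While trying to reproduce, one possible way to narrow things down is to run the same query with a small LIMIT and see whether anything comes back at all. This is only a sketch: the helper name `probe_collection` and the default limit are hypothetical, and it assumes `db.aql.execute` accepts `bind_vars` in the installed python-arango version.

```python
def probe_collection(db, collection_name, limit=10):
    """Fetch at most `limit` documents to check that iteration starts.

    `db` is expected to be a python-arango database object. The helper
    name and the default limit are illustrative only.
    """
    cursor = db.aql.execute(
        'FOR doc IN @@coll LIMIT @n RETURN doc',
        # '@coll' fills the @@coll collection placeholder, 'n' fills @n.
        bind_vars={'@coll': collection_name, 'n': limit},
    )
    return list(cursor)


# Usage (requires a reachable ArangoDB server):
#   sample = probe_collection(db, 'Articles')
#   print(len(sample))
```

If the limited query returns promptly while the full one hangs, that would point at the size of the result set rather than at the connection itself.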

@akki
Author

akki commented Feb 2, 2017

@joowani Sorry, somehow I missed your earlier comment.
I was using an ArangoDB 3.0 cluster with python-arango 3.4.1.
Currently, we do not plan to upgrade our database or package versions anytime soon, but I'll keep you updated about anything I find that might help with investigating this issue.
Thanks for following up, really appreciated!

@joowani
Contributor

joowani commented Feb 5, 2017

Hi @akki,

Your first issue (the one with error [HTTP 501][ERR 1470] '/_api/export' is not yet supported in a cluster) should be fixed in the latest release, python-arango 3.5.0. Please try it out and let me know if you have any problems. I am still investigating the other issue. Thanks.

@joowani
Contributor

joowani commented Feb 5, 2017

It seems the second problem in this issue is a duplicate of #30. As the first problem is resolved, I will be merging this ticket into #30.

3 participants