
Whole search fails if one node can't connect #3

Open
jure opened this issue Jun 22, 2014 · 1 comment

Comments


jure commented Jun 22, 2014

Related to this issue: tsujio/webrtc-chord#5

I'm storing keywords and documents in the DHT, so if you search for "cancer software", it will first retrieve the key "cancer", then the key "software" from the network. These keys will contain document id arrays, e.g.:

cancer: ["10.1039/cancer.research.1", "10.1039/cancer.research.2"]
software: ["10.1039/cancer.research.1", "10.1039/cancer.research.2", "10.4000/cancer.research.3"]

It then performs an intersection of these document id arrays and retrieves the matching documents from the DHT network. For the above example, roughly:

intersection = ["10.1039/cancer.research.1", "10.1039/cancer.research.2"]
_.each(intersection, function (docId) { chord.retrieve(docId) })
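The intersection step can be sketched in plain JavaScript (the `intersect` helper is illustrative, not the project's actual code; the data is the example from above):

```javascript
// Keep only ids that appear in both keyword result arrays.
function intersect(a, b) {
  return a.filter(function (id) { return b.indexOf(id) !== -1; });
}

var cancer = ["10.1039/cancer.research.1", "10.1039/cancer.research.2"];
var software = ["10.1039/cancer.research.1", "10.1039/cancer.research.2", "10.4000/cancer.research.3"];

// Only documents matching every keyword survive the intersection.
var intersection = intersect(cancer, software);
// intersection: ["10.1039/cancer.research.1", "10.1039/cancer.research.2"]
```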

So for this search, 4 requests are made to the DHT: cancer, software, 10.1039/cancer.research.1 and 10.1039/cancer.research.2. (In fact, there are more requests still, because each keyword is queried for every field of a document (title, abstract, authors, journal, etc.), in the form of "[fieldname]keyword" keys. With 5 fields per document, that's 10 requests for just two keywords, and then 2 more to get the actual documents.)
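To make the request count concrete, here is a minimal sketch of how the per-field keys multiply (searchKeys and the field list are illustrative assumptions, not the project's actual code):

```javascript
// Build one DHT key per (field, keyword) pair, in the "[fieldname]keyword" form.
function searchKeys(keywords, fields) {
  var keys = [];
  keywords.forEach(function (keyword) {
    fields.forEach(function (field) {
      keys.push(field + keyword);
    });
  });
  return keys;
}

var fields = ["title", "abstract", "authors", "journal", "year"]; // assumed 5 fields
var keys = searchKeys(["cancer", "software"], fields);
// keys.length === 10: every one of these is a separate DHT request,
// before the per-document retrievals even start.
```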

If any of these fails, the whole search fails. I cache the document id lookups, as these are static, but even so, the failure rate for searches is quite high.

I guess document lookup could fall back to dx.doi.org when the ID is a DOI (which is not always the case), but even so, there should be a way to make this more resilient: either by failing partially in a smart way, or by contacting replicas for keywords whose main node can't be reached.
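The replica idea could look roughly like this. This is a hypothetical sketch, assuming synchronous lookup functions that throw when a node is unreachable; lookupWithReplicas is not part of webrtc-chord's actual API:

```javascript
// Try the main node first, then each replica in turn; only give up on a
// key once every candidate has failed, so one dead node no longer sinks
// the whole search.
function lookupWithReplicas(key, lookups) {
  for (var i = 0; i < lookups.length; i++) {
    try {
      return lookups[i](key);
    } catch (e) {
      // Node unreachable; fall through to the next replica.
    }
  }
  return null; // All replicas failed; the caller can partially fail instead.
}
```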


jure commented Jun 29, 2014

Some progress has been made on this: the search now ignores documents that can't be found on the network: ae883ae#diff-b880c77d0f382525de5100984f260cebR336

As a result, a query sometimes reports '16 results found' but displays only 4, because the other 12 could not be retrieved from the network.

It's a start.
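The mismatch between the reported and displayed counts can be sketched like this (summarize is an illustrative helper, not the actual code in the commit above):

```javascript
// Drop documents whose retrieval failed (represented here as null) and
// report both how many ids matched and how many documents were displayed.
function summarize(intersection, retrieved) {
  var found = retrieved.filter(function (doc) { return doc != null; });
  return { matched: intersection.length, displayed: found.length, docs: found };
}

// 16 ids match the query, but only 4 lookups succeed.
var ids = [];
for (var i = 0; i < 16; i++) ids.push("doc" + i);
var retrieved = ids.map(function (id, i) { return i < 4 ? { id: id } : null; });
var summary = summarize(ids, retrieved);
// summary.matched === 16, summary.displayed === 4
```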
