Solr indexing sometimes fails #27
Comments
@kltm have we seen anything like this for the owltools loader? |
Yes, things like it. At first blush, it looks like a timeout before finishing--either slow over the wire or too large. Are you still doing the single blob? |
Yes, a single blob. But it is not a consistent exception; it comes and goes. I think it started when we upgraded to Solr 6, so I'll try updating Solr first. |
You may be right at the limit then. Maybe try cutting out proxies or loading from local? It may be hard to figure it out without getting really good numbers on usage and time at each step. |
This sounds like a priority ticket - is it still happening? |
Yes, it still happens randomly. When it fails, the generated JSON can still be indexed afterwards, but that is a manual step. |
The PR here: #33 is merely duct tape to fix this. It looks like we've outgrown the current approach: the run requires 155G of memory, and with 64G in cached mem we end up swapping, which likely causes these intermittent timeout issues. Some thoughts on a way forward:
Pros: no additional development needed
|
On ordering the results: apparently you cannot apply an ORDER BY to a UNION of queries. I remember this was discussed when we initially developed the stack, and it is likely the reason for the huge mapDB approach. I can check whether there have been any updates since then that could serve as a workaround. |
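Until the database supports ordering across a UNION, one possible workaround (a sketch only, not the stack's actual code; the query functions are hypothetical stand-ins) is to give each branch of the UNION its own ORDER BY and then lazily merge the already-sorted result streams in the loader:

```python
import heapq

# Hypothetical stand-ins for the two branches of the UNION; assume each
# is returned already sorted by the database via its own ORDER BY.
def query_genes():
    return iter([("g1", "gene"), ("g3", "gene")])

def query_diseases():
    return iter([("g2", "disease"), ("g4", "disease")])

# heapq.merge lazily merges any number of pre-sorted iterables, so the
# combined ordered result is never materialized in memory all at once.
merged = heapq.merge(query_genes(), query_diseases(), key=lambda row: row[0])
print([row[0] for row in merged])  # -> ['g1', 'g2', 'g3', 'g4']
```

Because the merge is streaming, this avoids holding the full union in memory, which matters given the footprint described above.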
@kshefchek Honestly, I think that 1b and 2 are the only real ways forward.
|
@kshefchek Practically speaking though, it seems like there is probably a fundamental non-scaling issue in the code. While technically 1b might work, it is likely only deferring the issue a little while. At the very least, I'd vote that 2 should be done until the exact nature of the scaling issue is known. |
Neo4j now has procedures that can be used as a workaround for post-union processing. On a test query it seems to work just fine; I can test this out if it sounds like a good way forward. |
Taking a closer look, these timeouts occur when Neo4j has a long GC pause (example log excerpts from the Golr loader and from Neo4j were attached here). I have a test branch that uses post-union processing to order the results, which uses less memory but has longer GC pauses, so the timeouts are more frequent. Knowing now that Neo4j is the issue, perhaps we can play around with some tuning of memory and GC settings, but we may need to switch to a server with more memory as our graph grows. |
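As a starting point for that tuning, a sketch of the relevant `neo4j.conf` knobs (the setting names are from Neo4j 3.x; the sizes below are placeholders, not tuned recommendations for this server):

```
# neo4j.conf -- memory/GC tuning sketch (values are placeholders)
dbms.memory.heap.initial_size=16g
dbms.memory.heap.max_size=16g
# Keep heap + page cache well within physical RAM to avoid the swapping seen above
dbms.memory.pagecache.size=32g
# G1 tends to give shorter pauses than older collectors on large heaps
dbms.jvm.additional=-XX:+UseG1GC
```

Pinning initial and max heap to the same size avoids resize pauses; the main constraint is that heap plus page cache must leave headroom for the OS, or swapping returns.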
This is resolved now. The fix was replacing UNION with UNION ALL, so Neo4j no longer uniquifies query results; the deduplication is done in the indexing code anyway. The loader now has a much smaller memory footprint. |
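The dedup-in-the-loader idea can be sketched like this (hypothetical names, not the actual loader code): stream the UNION ALL rows and drop repeats with a seen-set, rather than paying for DISTINCT inside the database.

```python
def unique_rows(rows, key=lambda row: row["id"]):
    """Drop duplicate rows on the fly -- UNION-style distinctness,
    but applied in the indexing code instead of inside Neo4j."""
    seen = set()
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            yield row

# UNION ALL can return the same node from more than one branch of the
# query; the loader filters the repeats before handing rows to Solr.
rows = [{"id": "a"}, {"id": "b"}, {"id": "a"}]
print(list(unique_rows(rows)))  # -> [{'id': 'a'}, {'id': 'b'}]
```

The seen-set only holds keys, not whole rows, which is why this is so much cheaper than having the database materialize and uniquify the full result set.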
Not sure why this is happening, especially with a local Solr instance. It doesn't seem to index the file at all.
The first thing to do is to make the process fail; for now this is a silent exception. Then, since the file is not indexed at all, a retry can be attempted, or we can at least keep the JSON file for a manual retry.
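A minimal sketch of that behavior (hypothetical function and file names, not the loader's real API): retry a bounded number of times, and on repeated failure re-raise instead of swallowing the exception, leaving the JSON on disk for a manual re-run.

```python
import time

def index_with_retry(index_fn, json_path, attempts=3, delay_s=5.0):
    """Try to index json_path; on repeated failure, fail the process
    loudly. The JSON file is intentionally never deleted, so a manual
    retry remains possible."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            index_fn(json_path)
            return True
        except Exception as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay_s)
    raise RuntimeError(
        f"indexing {json_path} failed after {attempts} attempts"
    ) from last_error
```

The key change from the current behavior is that the exception propagates, so the run fails visibly instead of silently producing an unindexed file.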