Port Django management command from Kitsune and Kuma. #168
Wow! I'm pretty psyched about this. I skimmed it and it looks good so far. I'm pretty hard up for free time, but after you land some tests and docs, I'll make a point of making time to look through it more carefully.
refresh_interval = index_settings.get('index.refresh_interval', '1s')

# Disable automatic refreshing
es.update_settings(write_index, {'index': {'refresh_interval': '-1'}})
Another optimization could be to set number_of_replicas to zero while re-indexing, which reduces copying data between nodes. Once you're done you can set it back to whatever it was and ES will bulk copy to the replication nodes.
Nice! Will incorporate this :)
@robhudson Do you happen to have a reference somewhere that this is supposed to be used like this?
I've tried to find docs on it but couldn't. If you have enough data locally you could maybe test it both ways and see which is faster. I forget where I learned of this... either the Elasticsearch training or maybe from Hanno while I was writing the reindexing jobs for marketplace?
I remember hearing about this during ES training, so that's probably where you heard it too.
Also wrap index setting in a try/finally block to make sure we set it back even if we raise an exception in between.
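Something along these lines is what I have in mind. It's only a rough sketch: `bulk_index_settings` and `FakeES` are made-up names for illustration, and the only client call assumed is the `es.update_settings(index, settings)` form already used in the diff above. The saved values are passed in by the caller, who would read them from the index settings first.

```python
from contextlib import contextmanager


class FakeES:
    """Hypothetical stand-in client that only records update_settings calls."""

    def __init__(self):
        self.calls = []

    def update_settings(self, index, settings):
        self.calls.append((index, settings))


@contextmanager
def bulk_index_settings(es, index, refresh_interval='1s', number_of_replicas=1):
    """Turn off refresh and replicas while indexing, then restore them.

    refresh_interval and number_of_replicas are the values to restore
    afterwards; the caller is assumed to have read them beforehand.
    """
    es.update_settings(index, {'index': {
        'refresh_interval': '-1',
        'number_of_replicas': 0,
    }})
    try:
        yield
    finally:
        # Runs even if indexing raised, so the index is never left
        # with refreshing disabled and no replicas.
        es.update_settings(index, {'index': {
            'refresh_interval': refresh_interval,
            'number_of_replicas': number_of_replicas,
        }})
```

Usage would be `with bulk_index_settings(es, write_index, refresh_interval, replicas): ...` around the indexing loop, so the restore step doesn't depend on the loop finishing cleanly.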
…index with the given name.
@@ -95,6 +95,11 @@ file:
multiple indexes, but we have no tests for that and I haven't
tested it, either.

.. data:: ES_WRITE_INDEXES

Similar to :data:`ES_INDEXES` this specifies the indexes to write
More specifically, it's a mapping of doctypes -> indexes to write to.
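For the docs, a small settings example might help. This is just an illustration of the mapping shape; the index names are invented, and I'm assuming the values are lists of index names in the same way as `ES_INDEXES`.

```python
# Hypothetical Django settings fragment; index names are made up.

# Doctype -> indexes to read from.
ES_INDEXES = {'default': ['main_index']}

# Doctype -> indexes to write to. During a reindex this can point at a
# new index while reads still go to the old one.
ES_WRITE_INDEXES = {'default': ['main_index_v2']}
```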
def __len__(self):
    return len(_model_cache)
I'm glad you cleaned this up. This whole thing was so icky. Thank you!
I'm really really excited you're taking this on. I'm really really sorry it took me a couple of weeks to really work through this. I think there's a lot of good stuff in here. Clearly it needs some documentation changes and probably some tests, too. There are some API decisions in here that I want to play with in an example project (or two or three) before I know how I feel about them. I think it's worth doing another round of fixes in this PR and if that's good, then we'll land it. We have a couple of options:
Option 2 lets us do a 0.8.2 release if we need to. Plus it lets us tinker with things at a more leisurely pace. I think I'm inclined to go with that. @jezdez What's your timeline for this? Is this something you need to land asap or is this something we can take the next month or so to work on?
@willkg My timeline is basically ASAP. I'm a bit tired of discussing basic Python library design though, so don't expect me to spend (or wait) weeks for this to land. I thought I could make a quick leap for the project but the feedback you've given me so far shows that that was a mistake.
I'm just trying to help make sure the code is good. I didn't mean to piss you off. @robhudson Can you take over reviewing from here? It's clear I'm sucking at it.
What's remaining to get the management command merged? I think having an out-of-the-box indexing management command is an essential feature for Django integration. ElasticUtils has been great with everything else (celery tasks, etc), but this was an obvious omission.
After talking with @jezdez and @robhudson at PyCon, I'm going to close this out. We're going to go in a different direction.
This is work in progress since I need to port over some tests, too ;)