New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task warning bypass #370

Merged
merged 2 commits into from Apr 13, 2013

Conversation

Projects
None yet
@mkrisher
Contributor

mkrisher commented Feb 27, 2013

Someone asked to add a silence option for automated environments in a comment here:

7cfe688#commitcomment-2703943

This pull request addresses this issue.

mkrisher added some commits Feb 27, 2013

Merge branch 'master' of git://github.com/sunspot/sunspot
* 'master' of git://github.com/sunspot/sunspot:
  Sunspot 2.0.0.pre is now 2.0.0
  Update the Rakefile to run the tests for the current versions of Rails that we're testing against.
  add license information to gemspec
  Bump sqlite3 from 1.3.1 to 1.3.7
  Adds note to rake task informing user that it is destructive
Adds silence option to argument array, and better matching to answer
This commit adds a silence option for users that have automated this
task and want to skip the destructive prompt. It also changes the
matching of a user's answer to explicitally check for 'y'.

mkrisher referenced this pull request Feb 27, 2013

Adds note to rake task informing user that it is destructive
The title of the task is a little misleading. The assumption could be
made that the task will just update the data via a reindex, when in fact
it drops the current indexes completely and starts over. This can be
very destuctive if you have a large data set which can takes weeks to
reconstruct.
@chourobin

This comment has been minimized.

Show comment
Hide comment
@chourobin

chourobin commented Mar 6, 2013

+1

@thimios

This comment has been minimized.

Show comment
Hide comment
@thimios

thimios commented Mar 22, 2013

+1

@felixbuenemann

This comment has been minimized.

Show comment
Hide comment
@felixbuenemann

felixbuenemann Mar 24, 2013

Contributor

Also adding my comments here, because commit comments are easily overlooked:

I agree that there should be a feature to disable the question, but I think there are more elegant ways to achieve it than what @mkrisher suggested. Either check if there's a tty via STDOUT.tty?or depend on a rake parameter like --silent or --quiet which is exposed via Rake.application.options.

Contributor

felixbuenemann commented Mar 24, 2013

Also adding my comments here, because commit comments are easily overlooked:

I agree that there should be a feature to disable the question, but I think there are more elegant ways to achieve it than what @mkrisher suggested. Either check if there's a tty via STDOUT.tty?or depend on a rake parameter like --silent or --quiet which is exposed via Rake.application.options.

@nz

This comment has been minimized.

Show comment
Hide comment
@nz

nz Mar 26, 2013

Member

I agree with all of the above. And I might even go a bit further and characterize the reindex warning as overly harsh and intimidating. So I'd be open to redesigning the reindexing process a bit to try to reduce the need for that warning to be shown in the first place.

One thought here is to make the behavior of purging of existing documents optional. You should usually only need to purge the entire index if your application regularly fails to delete documents from Solr as ActiveRecord objects are deleted. Another slightly rare case is if you change your schema.xml such that you have fields which are now undefined in Solr, which can cause some occasional issues.

Those two aside, there isn't really a benefit to always clearing the index every time you reindex.

Another thought: rather than an interactive confirmation, how about a passive timer that gives you the ability to cancel if you aren't sure whether you want to proceed? I sometimes find myself blocking destructive operations in my own code with something like this:

puts "Hey, you're about to destroy something! Are you super sure you want to do that?"
print "Press ^C to cancel. Otherwise, proceeding in "
5.downto(0).each { |i| pring "#{i}... "; sleep(1) }
puts

For an output like this…

Hey, you're about to destroy something! Are you super sure you want to do that?
Press ^C to cancel. Otherwise, proceeding in 5... 4... 3... 2... 1... 0...
Destroyed your stuff. Proceeding to do some other thing...

The other use case for a warning is that reindexing may take a really long time. We don't need forceful consent for that, we just want to set expectations. So the timer to interrupt could work for that.

TLDR

My ideas:

  1. Make the destruction of documents optional, to be explicitly confirmed with some kind of "force" parameter or environment variable.
  2. Change mode of interactivity to a ^C interrupt, with a bit of a timer to read and react. No TTY required for consent.
Member

nz commented Mar 26, 2013

I agree with all of the above. And I might even go a bit further and characterize the reindex warning as overly harsh and intimidating. So I'd be open to redesigning the reindexing process a bit to try to reduce the need for that warning to be shown in the first place.

One thought here is to make the behavior of purging of existing documents optional. You should usually only need to purge the entire index if your application regularly fails to delete documents from Solr as ActiveRecord objects are deleted. Another slightly rare case is if you change your schema.xml such that you have fields which are now undefined in Solr, which can cause some occasional issues.

Those two aside, there isn't really a benefit to always clearing the index every time you reindex.

Another thought: rather than an interactive confirmation, how about a passive timer that gives you the ability to cancel if you aren't sure whether you want to proceed? I sometimes find myself blocking destructive operations in my own code with something like this:

puts "Hey, you're about to destroy something! Are you super sure you want to do that?"
print "Press ^C to cancel. Otherwise, proceeding in "
5.downto(0).each { |i| pring "#{i}... "; sleep(1) }
puts

For an output like this…

Hey, you're about to destroy something! Are you super sure you want to do that?
Press ^C to cancel. Otherwise, proceeding in 5... 4... 3... 2... 1... 0...
Destroyed your stuff. Proceeding to do some other thing...

The other use case for a warning is that reindexing may take a really long time. We don't need forceful consent for that, we just want to set expectations. So the timer to interrupt could work for that.

TLDR

My ideas:

  1. Make the destruction of documents optional, to be explicitly confirmed with some kind of "force" parameter or environment variable.
  2. Change mode of interactivity to a ^C interrupt, with a bit of a timer to read and react. No TTY required for consent.
@marksweston

This comment has been minimized.

Show comment
Hide comment
@marksweston

marksweston Mar 27, 2013

I came here thinking I would be making my first pull request to Sunspot, but instead I'll just +1 this one.

I don't have a problem with the warning - it makes sense to warn people. But our development workflow sometimes involves multiple reindexes in an hour, and our production workflow requires at least one a week - something we were just trying to automate when the warning started appearing.

This is the minimal option I'd planned on implementing. My vote is merge this in now and and discuss how to improve the warning afterwards. It's pretty inconvenient as is.

@nz - I think the warning as currently written is actually fine - it serves both purposes you mention and gives users a chance to avoid an expensive mistake. I suspect the timed countdown would be less effective at that; I know that when I run rake sunspot:reindex I invariably tab away to another task while I wait for it to finish - I might not have even noticed that message! And once you go for that style of warning you end up balancing having to allow enough time for a newer user to read, understand and act on the warning against making everyone wait the same length of time whether they need the warning or not.

marksweston commented Mar 27, 2013

I came here thinking I would be making my first pull request to Sunspot, but instead I'll just +1 this one.

I don't have a problem with the warning - it makes sense to warn people. But our development workflow sometimes involves multiple reindexes in an hour, and our production workflow requires at least one a week - something we were just trying to automate when the warning started appearing.

This is the minimal option I'd planned on implementing. My vote is merge this in now and and discuss how to improve the warning afterwards. It's pretty inconvenient as is.

@nz - I think the warning as currently written is actually fine - it serves both purposes you mention and gives users a chance to avoid an expensive mistake. I suspect the timed countdown would be less effective at that; I know that when I run rake sunspot:reindex I invariably tab away to another task while I wait for it to finish - I might not have even noticed that message! And once you go for that style of warning you end up balancing having to allow enough time for a newer user to read, understand and act on the warning against making everyone wait the same length of time whether they need the warning or not.

@gabceb

This comment has been minimized.

Show comment
Hide comment
@gabceb

gabceb Apr 12, 2013

👍 on making this happen. I am using a Capistrano task to reindex and it fails because there is no way to get past the reindex prompt.
image

gabceb commented Apr 12, 2013

👍 on making this happen. I am using a Capistrano task to reindex and it fails because there is no way to get past the reindex prompt.
image

nz added a commit that referenced this pull request Apr 13, 2013

Merge pull request #370 from mkrisher/task_warning_bypass
Bypass the interactive warning when reindexing.

@nz nz merged commit 96b6815 into sunspot:master Apr 13, 2013

1 check passed

default The Travis build passed
Details
@nz

This comment has been minimized.

Show comment
Hide comment
@nz

nz Apr 13, 2013

Member

Merged, thanks! Really it was @gabceb's memeage that sold me.

Member

nz commented Apr 13, 2013

Merged, thanks! Really it was @gabceb's memeage that sold me.

@gabceb

This comment has been minimized.

Show comment
Hide comment
@gabceb

gabceb Apr 13, 2013

😊 Thanks @nz!

gabceb commented Apr 13, 2013

😊 Thanks @nz!

@nz

This comment has been minimized.

Show comment
Hide comment
@nz

nz Apr 13, 2013

Member

Worth a 2.1 release?

Member

nz commented Apr 13, 2013

Worth a 2.1 release?

@gabceb

This comment has been minimized.

Show comment
Hide comment
@gabceb

gabceb Apr 13, 2013

That would be awesome

gabceb commented Apr 13, 2013

That would be awesome

@marksweston

This comment has been minimized.

Show comment
Hide comment
@marksweston

marksweston Apr 13, 2013

Thanks @nz - and +1 to a new release

marksweston commented Apr 13, 2013

Thanks @nz - and +1 to a new release

@felixbuenemann

This comment has been minimized.

Show comment
Hide comment
@felixbuenemann

felixbuenemann Apr 13, 2013

Contributor

@gabceb btw. you could always use yes | rake sunspot:reindex to automatically answer the question.

Contributor

felixbuenemann commented Apr 13, 2013

@gabceb btw. you could always use yes | rake sunspot:reindex to automatically answer the question.

@leppert

This comment has been minimized.

Show comment
Hide comment
@leppert

leppert Jul 2, 2013

FWIW, the given example of rake sunspot:reindex[,,true] fails to index anything for me, I assume because the falsy handling for either batch_size or models leaves one or both of those values empty. Specifying both works as expected: rake sunspot:reindex[50,MyModel+MyOtherModel,true]

leppert commented Jul 2, 2013

FWIW, the given example of rake sunspot:reindex[,,true] fails to index anything for me, I assume because the falsy handling for either batch_size or models leaves one or both of those values empty. Specifying both works as expected: rake sunspot:reindex[50,MyModel+MyOtherModel,true]

@tab

This comment has been minimized.

Show comment
Hide comment
@tab

tab Aug 2, 2013

Until 2.1 release @xamut found a hack that works:

rails r 'Sunspot.searchable.each(&:solr_reindex)'

tab commented Aug 2, 2013

Until 2.1 release @xamut found a hack that works:

rails r 'Sunspot.searchable.each(&:solr_reindex)'
@zzak

This comment has been minimized.

Show comment
Hide comment
@zzak

zzak Aug 24, 2013

@marksweston Hi Mark! Think it would be interesting to only display the "large datasets" comment if the index is actually a large number. What do you think?

zzak commented Aug 24, 2013

@marksweston Hi Mark! Think it would be interesting to only display the "large datasets" comment if the index is actually a large number. What do you think?

@njakobsen

This comment has been minimized.

Show comment
Hide comment
@njakobsen

njakobsen Aug 24, 2013

Contributor

Maybe the real answer is the reindex without first dropping the existing records. That way the reindex is incremental. An option could be passed to drop the existing records if necessary. That way we don't have this problem in the first place. Any thoughts?

Contributor

njakobsen commented Aug 24, 2013

Maybe the real answer is the reindex without first dropping the existing records. That way the reindex is incremental. An option could be passed to drop the existing records if necessary. That way we don't have this problem in the first place. Any thoughts?

@felixbuenemann

This comment has been minimized.

Show comment
Hide comment
@felixbuenemann

felixbuenemann Aug 24, 2013

Contributor

@njakobsen That woulde be an option, dropping the index is only required for schema changes.

On another thought: Why doesn't the rake task check for STDOUT.tty? to determine if it's run interactively?

Contributor

felixbuenemann commented Aug 24, 2013

@njakobsen That woulde be an option, dropping the index is only required for schema changes.

On another thought: Why doesn't the rake task check for STDOUT.tty? to determine if it's run interactively?

@rubytastic

This comment has been minimized.

Show comment
Hide comment
@rubytastic

rubytastic Sep 30, 2013

FWIW, the given example of rake sunspot:reindex[,,true] fails to index anything for me, I assume because the falsy        
handling for either batch_size or models leaves one or both of those values empty. Specifying both works as
expected: rake sunspot:reindex[50,MyModel+MyOtherModel,true]

This fails for me on very latest sunspot. Is there a workaround?

rubytastic commented Sep 30, 2013

FWIW, the given example of rake sunspot:reindex[,,true] fails to index anything for me, I assume because the falsy        
handling for either batch_size or models leaves one or both of those values empty. Specifying both works as
expected: rake sunspot:reindex[50,MyModel+MyOtherModel,true]

This fails for me on very latest sunspot. Is there a workaround?

@zzak

This comment has been minimized.

Show comment
Hide comment
@zzak

zzak Sep 30, 2013

I have a working branch on my fork but im on a bus now, so i cant find it for you.

This reminds me to submit a patch upstream. Thanks!

On Sep 30, 2013, at 12:11 PM, rubytastic notifications@github.com wrote:

FWIW, the given example of rake sunspot:reindex[,,true] fails to index anything for me, I assume because the falsy
handling for either batch_size or models leaves one or both of those values empty. Specifying both works as
expected: rake sunspot:reindex[50,MyModel+MyOtherModel,true]
This fails for me on very latest sunspot. Is there a workaround?


Reply to this email directly or view it on GitHub.

zzak commented Sep 30, 2013

I have a working branch on my fork but im on a bus now, so i cant find it for you.

This reminds me to submit a patch upstream. Thanks!

On Sep 30, 2013, at 12:11 PM, rubytastic notifications@github.com wrote:

FWIW, the given example of rake sunspot:reindex[,,true] fails to index anything for me, I assume because the falsy
handling for either batch_size or models leaves one or both of those values empty. Specifying both works as
expected: rake sunspot:reindex[50,MyModel+MyOtherModel,true]
This fails for me on very latest sunspot. Is there a workaround?


Reply to this email directly or view it on GitHub.

@nz

This comment has been minimized.

Show comment
Hide comment
@nz

nz Sep 30, 2013

Member

So I think we've all realized by now that this "warning" is really way too aggressive. I'm actually in favor of dropping it entirely until some of the concerns above are addressed.

Now that we've all had a taste of a reindex warning, here's what I think would be a more appropriate and user-friendly design:

  1. Change the reindex behavior to only delete when explicitly instructed, making the default less destructive in the first place.

    Some people use the reindex to clear out documents that may not have been deleted normally through after_delete hooks, but it is probably more common just to backfill a new field in your searchable block. I'm not sure whether most people assume that reindex will delete their index first, but documenting the new flag would make the different behaviors clear.

    I'm not sure whether this counts as a backwards-incompatible change, or whether it's entirely necessary. It needs to be well-documented, at the least. Possible variable names: FORCE (a la Tire), CLEAR, PURGE, etc.

  2. If the deletion was explicitly requested, we can change the interactive confirmation to a non-interactive countdown. Start with the assumption that the developer knows what they are asking for, but allow them a few seconds to change their mind.

    Even if we don't make the delete an explicit option, I still feel like we should change the

  3. Skip the warning entirely outside of production, and/or when there are fewer than 100k documents.

Anyone want to come up with a pull request for that? Should be pretty straightforward.

Thanks for all the discussion, @felixbuenemann @zzak @njakobsen @rubytastic @leppert @tab. Let's get this fixed up.

Member

nz commented Sep 30, 2013

So I think we've all realized by now that this "warning" is really way too aggressive. I'm actually in favor of dropping it entirely until some of the concerns above are addressed.

Now that we've all had a taste of a reindex warning, here's what I think would be a more appropriate and user-friendly design:

  1. Change the reindex behavior to only delete when explicitly instructed, making the default less destructive in the first place.

    Some people use the reindex to clear out documents that may not have been deleted normally through after_delete hooks, but it is probably more common just to backfill a new field in your searchable block. I'm not sure whether most people assume that reindex will delete their index first, but documenting the new flag would make the different behaviors clear.

    I'm not sure whether this counts as a backwards-incompatible change, or whether it's entirely necessary. It needs to be well-documented, at the least. Possible variable names: FORCE (a la Tire), CLEAR, PURGE, etc.

  2. If the deletion was explicitly requested, we can change the interactive confirmation to a non-interactive countdown. Start with the assumption that the developer knows what they are asking for, but allow them a few seconds to change their mind.

    Even if we don't make the delete an explicit option, I still feel like we should change the

  3. Skip the warning entirely outside of production, and/or when there are fewer than 100k documents.

Anyone want to come up with a pull request for that? Should be pretty straightforward.

Thanks for all the discussion, @felixbuenemann @zzak @njakobsen @rubytastic @leppert @tab. Let's get this fixed up.

@marksweston

This comment has been minimized.

Show comment
Hide comment
@marksweston

marksweston Oct 1, 2013

@nz I think that's a good plan, but I do think it counts as a breaking change. We rely on sunspot:reindex clearing out the old index entries because changes to our data are made in a different application and then "published" to our customer-facing app (the one that uses Sunspot and Solr). Thus after_delete hooks aren't triggered.

It's not the usual use-case for Sunspot but I'm sure we're not the only ones doing something unusual. And if I hadn't been following this thread and found out later that the meaning of the sunspot:reindex task had changed in a way that meant references to non-existent database records were being left in my Solr index... Well, I doubt I'd be happy ;)

marksweston commented Oct 1, 2013

@nz I think that's a good plan, but I do think it counts as a breaking change. We rely on sunspot:reindex clearing out the old index entries because changes to our data are made in a different application and then "published" to our customer-facing app (the one that uses Sunspot and Solr). Thus after_delete hooks aren't triggered.

It's not the usual use-case for Sunspot but I'm sure we're not the only ones doing something unusual. And if I hadn't been following this thread and found out later that the meaning of the sunspot:reindex task had changed in a way that meant references to non-existent database records were being left in my Solr index... Well, I doubt I'd be happy ;)

@mkrisher

This comment has been minimized.

Show comment
Hide comment
@mkrisher

mkrisher Oct 1, 2013

Contributor

@nz 👍 to the idea of an explicit option to drop the indexes on reindex. My original warning message was a reaction and should have been implemented differently. Adding the explicit option and changing the default behavior seems like the best course of action.

Contributor

mkrisher commented Oct 1, 2013

@nz 👍 to the idea of an explicit option to drop the indexes on reindex. My original warning message was a reaction and should have been implemented differently. Adding the explicit option and changing the default behavior seems like the best course of action.

@zzak

This comment has been minimized.

Show comment
Hide comment
@zzak

zzak commented Oct 5, 2013

@felixbuenemann

This comment has been minimized.

Show comment
Hide comment
@felixbuenemann

felixbuenemann Oct 7, 2013

Contributor

@nz I like option 1 but maybe due to backwards compatibility there should be different tasks like reindex and update_index.

@mkrisher You could also hook into Rake's --quiet or --silent which are exposed via getters on Rake.application.options.

Contributor

felixbuenemann commented Oct 7, 2013

@nz I like option 1 but maybe due to backwards compatibility there should be different tasks like reindex and update_index.

@mkrisher You could also hook into Rake's --quiet or --silent which are exposed via getters on Rake.application.options.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment