
Tarantula deletes records and then complains about 404s when trying to access those records #3

Open
ghost opened this issue May 17, 2009 · 8 comments

@ghost

ghost commented May 17, 2009

Tarantula is complaining about 404 errors in a vanilla application built using Rails scaffolding for a single ActiveRecord model. It appears that Tarantula is performing actions in an incorrect order: it first deletes a record and then attempts to access the ’show’ page for that record. The application correctly responds with a 404 error (since the record no longer exists), but Tarantula reports the 404 as a test failure.

Compare this behavior to the way a user would interact with the app via a browser: Once a user deletes a record, the app redisplays the index page and does not show a link to the deleted record. Therefore, there’s no link that the user could follow to produce a 404 error.

How to Repeat

  1. Create a simple scaffolded Rails app (like the one attached to this ticket).
  2. Generate the default Tarantula test (using rake tarantula:setup).
  3. Run the Tarantula test.
  4. Observe the test failure.

Test Output

  [store] rake tarantula:test VERBOSE=true
  Loaded suite /Users/jason/.gem/ruby/1.8/gems/rake-0.8.3/lib/rake/rake_test_loader
  Started
  Response 200 for <Relevance::Tarantula::Link href=/, method=get>
  ...
  Response 302 for <Relevance::Tarantula::Link href=/products/996332877, method=delete>
  Response 404 for <Relevance::Tarantula::Link href=/products/996332877/edit, method=get>
  Response 404 for <Relevance::Tarantula::Link href=/products/996332877, method=get>
  Response 302 for <Relevance::Tarantula::Link href=/products/953125641, method=delete>
  Response 404 for <Relevance::Tarantula::Link href=/products/953125641/edit, method=get>
  Response 404 for <Relevance::Tarantula::Link href=/products/953125641, method=get>
  Response 302 for /products post {"commit"=>-2303, "product[price]"=>-3795, "product[name]"=>-4472}
  ...
  ****** FAILURES
  404: /products/996332877/edit
  404: /products/996332877
  404: /products/953125641/edit
  404: /products/953125641
  E
  Finished in 1.266037 seconds.

original LH ticket

This ticket has 1 attachment(s).

@ghost
Author

ghost commented May 17, 2009

If you modify the index.html.erb file and change the delete link_to to a button_to, everything works fine.

Theoretically, your links shouldn’t be pointing to deletes, but rather, they should be behind forms, that way web-crawlers don’t hit things and delete them for you, so button_to is probably more appropriate, but the scaffold does generate the link_to as a default.

Probably something that should be fixed, because it won’t occur to most people to not use link_to for deletes, but if you’re following good practices, it shouldn’t be an issue.

by Kevin Gisi
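
For reference, the scaffold change Kevin describes looks roughly like this in index.html.erb (assuming a product model; this is a sketch of the Rails 2.x idiom, not code from the attached app):

```erb
<%# Rails scaffold default: a link that issues the DELETE via JavaScript,
    which crawlers like Tarantula will happily follow %>
<%= link_to 'Destroy', product, :confirm => 'Are you sure?', :method => :delete %>

<%# Safer alternative: a real form, so the record is only deleted
    by an explicit form submission %>
<%= button_to 'Destroy', product, :confirm => 'Are you sure?', :method => :delete %>
```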

@glv
Contributor

glv commented Sep 4, 2009

Now that I've got forms and links unified onto a single crawl queue, we should be able to reorder things while they're on the queue. I need to replace the current queue (a simple Array) with a priority queue, but once that's done, here's the plan.

In FormSubmission, I can use the following line of code to learn the controller, action, and other parameters for a form action:

ActionController::Routing::Routes.recognize_path(@action, :method => @method.to_sym)

For a given controller, I need to sort actions this way:

  1. new
  2. create
  3. index
  4. edit
  5. update
  6. show
  7. anything else
  8. delete

Of course, in most apps index will probably have to be crawled first, before any of the others are even seen, and then after doing a create I may not even see the show, edit, or delete actions for the newly created object unless I visit index again. So I may need to add some smarts to add index to the queue again. This will require some experimentation.
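
The ordering above can be sketched as a simple priority lookup (a sketch of the plan, not Tarantula's actual code; a real implementation would key a priority queue on the action returned by recognize_path):

```ruby
# Priority table for the per-controller action ordering described above.
ACTION_PRIORITY = {
  "new"    => 0,
  "create" => 1,
  "index"  => 2,
  "edit"   => 3,
  "update" => 4,
  "show"   => 5,
  "delete" => 7,
}.freeze
OTHER_ACTIONS = 6 # "anything else" sorts between show and delete

def crawl_priority(action)
  ACTION_PRIORITY.fetch(action, OTHER_ACTIONS)
end

# Reordering a queue of pending actions for one controller:
queue = %w[delete show search index new edit]
queue.sort_by { |action| crawl_priority(action) }
# => ["new", "index", "edit", "show", "search", "delete"]
```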

@masterkain

Any word on this? I'm experiencing the same using 0.3.3. Thanks

@jasonrudolph
Contributor

No rock-solid solution just yet. There's a workaround you can consider using, but it's admittedly less than ideal.

def test_crawl
  spider = tarantula_crawler(self)

  # Ignore Tarantula failures that occur when it tries to crawl a previously-deleted resource
  spider.allow_404_for %r{/articles/\d+/edit}
  spider.allow_404_for %r{/authors/\d+/edit}

  spider.crawl "/"
end

I hope this helps in the meantime.

Cheers,
Jason

@masterkain

I'll try the suggestion, but it defeats the purpose of Tarantula a bit: allowing 404s can hide real problems in a moderately complex application.

Anyway, I changed all my link_to deletes to button_to, but still no dice; Tarantula is acting this way again. It should be noted that the delete button is only on the /edit object form view and doesn't exist on index pages.

@masterkain

Sorry for spamming, but I have to reconsider this as a semi-feature.

In some forms I have cascading selects that rely on data from the controller. This data is tied to an object that the controller assumes exists, scoped like obj.obj1.collection; if obj1 or obj is deleted before this view is shown, an exception is raised, probably because the object wasn't meant to be destroyed in the first place or because a validation elsewhere isn't doing its job. So when Tarantula catches this, it's a good idea to go back, revise the model's destroy policy, and fix your code accordingly.

If Tarantula hadn't destroyed the object prior to showing this controller's view, the problem might not have been discovered, unless some other tests were in place.

All in all, I can live with a ton of 404s in the report.
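
The failure mode described here can be sketched with invented names (obj1 and collection are stand-ins taken from the comment, not real model code):

```ruby
require "ostruct"

# A view helper that reaches through an association the controller
# assumes is always present.
def options_for_cascade_select(obj)
  obj.obj1.collection # assumes obj1 still exists
end

live   = OpenStruct.new(:obj1 => OpenStruct.new(:collection => %w[a b]))
orphan = OpenStruct.new(:obj1 => nil) # parent destroyed before the view rendered

options_for_cascade_select(live) # => ["a", "b"]

begin
  options_for_cascade_select(orphan)
rescue NoMethodError
  # this is the exception Tarantula surfaces in its report
end
```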

@glv
Contributor

glv commented Oct 19, 2009

I just want to file an update on this.

It was apparent early on that fixing this problem would require some major changes in Tarantula: a big internal refactoring and a change to the configuration interface, for starters. So while I can't say a fix for this is imminent, I can say that the groundwork has mostly been laid. I'm preparing a 0.4 release that includes a new configuration interface and an overhaul of how Tarantula keeps track of the crawl in progress. (If you want a taste of what the new config interface looks like, check #9.)

I hope that release will go out this week, and then the next order of business is to tackle this bug. (And don't worry, it'll still be possible to run crawls the same way they work now, so you can still find the problems you just mentioned.)

@markmcspadden

I kind of took things in a different direction over on the garlandgroup fork. I've added read_only and non_destructive attributes to the Crawler object. This tells the crawler to 1) skip all non-GET methods when read_only is true, or 2) skip DELETE methods when non_destructive is true.

t = tarantula_crawler(self)
t.read_only = true # or maybe you prefer: t.non_destructive = true
t.crawl "/"

It's definitely a flawed approach, and I still think the reordering is the right long-term solution, but it's getting me closer to where I want to be with tarantula running as part of our test suite.

(To me it's much more digestible, while setting up this suite, to start with just the read-only crawls, then move on to non-destructive, then finally the whole enchilada.)
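
A minimal sketch of the read_only / non_destructive idea (the Link struct and the skip? method here are stand-ins for illustration, not the fork's actual code):

```ruby
# Stand-in for a queued crawl target: a path plus the HTTP verb to use.
Link = Struct.new(:href, :verb)

class Crawler
  attr_accessor :read_only, :non_destructive

  # True when a queued link should be skipped instead of followed.
  def skip?(link)
    return true if read_only && link.verb != :get
    return true if non_destructive && link.verb == :delete
    false
  end
end

crawler = Crawler.new
crawler.read_only = true
crawler.skip?(Link.new("/products/1", :delete)) # => true
crawler.skip?(Link.new("/products/1", :get))    # => false
```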
