
Improved search feature (elasticsearch based, demo available) #455

Closed
wants to merge 22 commits into from

12 participants

Karel Minarik Don't Add Me To Your Organization a.k.a The Travis Bot Evan Phoenix Nick Zadrozny Christopher Meiklejohn Amos King Nick Quaranto Gustavo Barron Vipul A M knappe Sam Kottler Jimmy Cuadra
Karel Minarik

This pull request contains a proposed search feature overhaul for Rubygems.org, implemented with the elasticsearch search engine, via the Tire library.

Objectives

The main objective of the effort is to allow searching in more gem properties than just their names, notably in summaries, descriptions and authors; technically speaking, to increase both precision and recall of search at Rubygems.org.

Using a search engine — as opposed to a LIKE %term% database query — allows not only for better, faster searches, but also for advanced features such as a rich search query language, faceted navigation, and more.

Changelog

All the steps required for implementing the feature are committed on the search-steps branch, with extensive commit messages documenting the process. The important steps are:

  • 86240f2 and 3a17730 implement the most simple search with elasticsearch, adding model integration and using Tire in the controller

  • fa7d1cd adds complex mapping definition for the Rubygem model, allowing to search in gem summaries/descriptions, authors, dependencies, and more.

  • 8c6bee8 adds a sliding panel to the search results page which contains examples of searches with Lucene search query syntax.

Additional Cucumber scenarios were added to document the new search features. Some additional tweaks were required to run the test suite successfully at Travis CI.

Please review the branch compare page to see the full picture.

Demo Application

A demo application is available at http://rubygems-with-elasticsearch.herokuapp.com.

(UPDATE: new demo server here: link missing)

UPDATE: test servers terminated.

Try out simple searches such as rack or searching in authors: author:john and dependencies: uses:rack. More tips are available as in-page help.

The database contains only a limited subset of gems. The application is running on a free Heroku plan. The elasticsearch service is running on an Amazon EC2 t1.micro instance. Keep in mind that the application runs in a tweaked development mode (due to issues with assets etc.), so the demo application performance does not reflect the performance in the real production environment.

If a dump of the Rubygems production database were available, I'd like to import it into the demo application database.

Further Development

If the proposed search implementation is considered desirable, a number of further developments are possible, e.g.:

  • more fine-grained score computation based on number of downloads, not straight sorting,
  • allow sorting the results by number of downloads, alphabetically, by created or updated time,
  • highlighting the relevant matched snippets from gem properties,
  • adding faceted search on authors,
  • linking to a specific matched version from search results,
  • displaying aggregated statistics such as authors with most gems, authors with most downloaded gems, etc.,
  • adding Tire's NewRelic instrumentation to track performance.

Installation Instructions

To check out the search feature locally, assuming you have cloned the Rubygems.org repository, set it up according to the instructions first:

./script/setup

To import your local gems into the database, run:

bundle exec rake gemcutter:import:process

Then, install elasticsearch using your preferred method. On Mac OS X, the easiest way is to use Homebrew:

brew install elasticsearch

To import the gems from the database into elasticsearch, run:

bundle exec rake environment tire:import CLASS='Rubygem' FORCE=1
Don't Add Me To Your Organization a.k.a The Travis Bot

This pull request fails (merged 8c6bee8 into 7ac6d16).

Karel Minarik

As noted, @travisbot requires some additional tweaks to run the test suite. (Still, Travis is very unreliable when running the full test suite, see http://travis-ci.org/#!/karmi/rubygems.org/builds)

[Edit] example of a successful test run: http://travis-ci.org/#!/karmi/rubygems.org/jobs/2286458

Evan Phoenix
Owner

Looks great! I don't have any experience with elasticsearch, does it require running another service? If so, there aren't any details on how to get that service running and we'll need that.

Nick Zadrozny
nz commented August 30, 2012

Looks great, @karmi!

@evanphx: I'd consider it a privilege to sponsor the search hosting on http://bonsai.io/

Karel Minarik

@evanphx Thanks! Right now, the elasticsearch service for the demo application runs on an EC2 instance, provisioned with Chef. I think there would be no problem getting someone to sponsor the box, possibly including the elasticsearch.com company.

As @nz points out, Bonsai is available to host the search service as well -- though issues with the Tire library and Bonsai would have to be sorted out...

Evan Phoenix
Owner

I'm wary of having rubygems.org depend on an external service like bonsai.io. There is A LOT of traffic and I worry the site would depend too heavily on the reliability of something we don't have control over.

Karel Minarik

I understand the concern, Evan.

However, if we want to make the search at Rubygems.org radically better, there's no way around depending on some external factor in one way or another: all the major search engines are external processes/services (except TSearch).

It's similar, in fact, to dependency on Redis (for tracking downloads) in the current codebase.

elasticsearch itself is open source and free, based on Lucene, written in Java, and can be run trivially on any real or virtual server. It's particularly suited to run in cloud environments such as Amazon AWS (i.e. with little latency to Heroku), but is in no way tied to or affiliated with Amazon.

(Regarding the work needed to set up, configure and maintain an elasticsearch server, I can handle such duties just fine.)

If we want to move forward with the proposed search functionality, I think we need to work on these points:

  1. Is the proposed feature something we want to use, eventually, at Rubygems.org? If so, let's discuss what steps are required to merge it into master and roll it into production.

  2. Can a full dump of the Rubygems production database be provided? If so, let me load it up to the demo application so everybody can try various kinds of searches and kick the tires on the feature.

  3. If everybody's happy with the proposed search feature, let's work on polishing it further -- the first thing is using Rubygem#downloads as a score boosting factor, not as a straightforward sorting criterion.
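To make point 3 a bit more concrete, here is a minimal sketch of the difference between sorting on downloads and using downloads as a boosting factor. It is plain Ruby, not Tire or elasticsearch code; the `boosted_score` helper, the logarithmic damping and the sample numbers are all assumptions made up for illustration.

```ruby
# Hypothetical illustration: downloads as a score factor instead of a sort key.
# The relevance scores and download counts are made-up sample data.
Hit = Struct.new(:name, :relevance, :downloads)

# Dampen the download count with a logarithm so that popularity nudges the
# text relevance instead of overriding it (the formula is an assumption).
def boosted_score(hit)
  hit.relevance * Math.log10(10 + hit.downloads)
end

hits = [
  Hit.new("rack",         1.0,  1_000),
  Hit.new("rack-contrib", 0.4,  50_000),
  Hit.new("unrelated",    0.05, 5_000_000)
]

# Straight sorting: the barely relevant but very popular gem comes first.
by_downloads = hits.sort_by { |h| -h.downloads }.map(&:name)

# Boosting: the best textual match stays on top, popularity breaks near-ties.
by_boost = hits.sort_by { |h| -boosted_score(h) }.map(&:name)

puts "sorted by downloads: #{by_downloads.join(', ')}"
puts "sorted by boost:     #{by_boost.join(', ')}"
```

With straight sorting, a weak match with a huge download count outranks an exact name match; with a damped boost it does not.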

Evan Phoenix
Owner

I'm sorry, I wasn't clear. I don't have an issue running the elasticsearch service on the rubygems.org servers. I am worried about using a hosted elasticsearch service because then usage of it is dependent on a lot more (network conditions, cloud health, etc).

As for the running of it, that's very kind of you to offer to set up, configure, and maintain it, but that likely won't work because then you'd need to be effectively on-call all the time. We don't have an issue maintaining it, but I would like some guidance into how it should be configured, how much disk/memory it will use, etc.

Nick Zadrozny
nz commented August 31, 2012

@evanphx: ElasticSearch has a pretty good guide for self-hosting on Amazon: http://www.elasticsearch.org/tutorials/2011/08/22/elasticsearch-on-ec2.html

FWIW, we made the same offer to host the search at websolr when Solr was on the table a while back. I did some digging with qrush at one point into the question of traffic volume, and was completely comfortable with the numbers. Besides that, we literally are on call all the time :-)

That said I get the value of self-hosting for you here, and am happy to be available to talk tech when it comes to hosting ES. I'll idle in #gemcutter today (nz) if you want to talk more about capacity planning, which is almost always an experimental process.

Nick Zadrozny
nz commented August 31, 2012

Er, make that #rubygems :)

Karel Minarik

Thanks for the clarification, Evan!

> I don't have an issue running the elasticsearch service on the rubygems.org servers.
> We don't have an issue maintaining it, but I would like some guidance into
> how it should be configured, how much disk/memory it will use, etc.

Perfect! elasticsearch is pretty easy to install and operate; in terms of required resources, for the Rubygems.org use case a modest machine will be more than enough.

I can certainly help with the installation and configuration of elasticsearch on your servers; just ask about the specifics! The easiest way is to use the Chef cookbook. Please see the tutorial at the elasticsearch.org site.

For the Rubygems.org use case, one elasticsearch node should be enough, though for proper failover and scalability, two nodes would be desirable. In terms of resources needed, elasticsearch needs mostly RAM. Any modest machine comparable to EC2 small to large would be enough, assuming it has a couple of gigabytes of memory to spare. (Note that the demo application uses the micro instance and happily purrs along with just 613MB of RAM.)

Provided the database dump from Rubygems.org is available, I can do some capacity testing with the full set of data against EC2 instances.

Christopher Meiklejohn
Collaborator

@karmi

Looks like I'm getting some errors on the console when running the test suite, but the tests aren't failing. Is this something to be concerned with?

# Running tests:

..................................................................................................................................................................................................................................................................................................................................................................................................[REQUEST FAILED] curl -X GET "http://localhost:9200/test_rubygems/rubygem/_search?load%5Binclude%5D=versions&page=&per_page=30&size=30&pretty=true" -d '{"query":{"bool":{"should":[{"text":{"name":{"query":"bang!","type":"phrase_prefix","operator":"and","boost":100}}},{"query_string":{"query":"bang!","default_operator":"and"}}]}},"sort":[{"downloads":"desc"},{"name.raw":"asc"}],"filter":{"term":{"indexed":true}},"size":30}'
.[REQUEST FAILED] curl -X GET "http://localhost:9200/test_rubygems/rubygem/_search?load%5Binclude%5D=versions&page=&per_page=30&size=30&pretty=true" -d '{"query":{"bool":{"should":[{"text":{"name":{"query":"bang!","type":"phrase_prefix","operator":"and","boost":100}}},{"query_string":{"query":"bang!","default_operator":"and"}}]}},"sort":[{"downloads":"desc"},{"name.raw":"asc"}],"filter":{"term":{"indexed":true}},"size":30}'
.[REQUEST FAILED] curl -X GET "http://localhost:9200/test_rubygems/rubygem/_search?load%5Binclude%5D=versions&page=&per_page=30&size=30&pretty=true" -d '{"query":{"bool":{"should":[{"text":{"name":{"query":"bang!","type":"phrase_prefix","operator":"and","boost":100}}},{"query_string":{"query":"bang!","default_operator":"and"}}]}},"sort":[{"downloads":"desc"},{"name.raw":"asc"}],"filter":{"term":{"indexed":true}},"size":30}'
.[REQUEST FAILED] curl -X GET "http://localhost:9200/test_rubygems/rubygem/_search?load%5Binclude%5D=versions&page=&per_page=30&size=30&pretty=true" -d '{"query":{"bool":{"should":[{"text":{"name":{"query":"bang!","type":"phrase_prefix","operator":"and","boost":100}}},{"query_string":{"query":"bang!","default_operator":"and"}}]}},"sort":[{"downloads":"desc"},{"name.raw":"asc"}],"filter":{"term":{"indexed":true}},"size":30}'
........................................................................

Finished tests in 115.507678s, 3.9911 tests/s, 6.4411 assertions/s.

461 tests, 744 assertions, 0 failures, 0 errors, 0 skips
Christopher Meiklejohn cmeiklejohn closed this September 08, 2012
Christopher Meiklejohn
Collaborator

Ugh, whoops for the close. Github UI failure.

Christopher Meiklejohn cmeiklejohn reopened this September 08, 2012
Karel Minarik

@cmeiklejohn Yes, that is intentional -- it's the Tire's STDERR output, coming from tests for "user enters invalid Lucene query", bang! in this case. See https://github.com/karmi/rubygems.org/blob/search-steps/test/functional/searches_controller_test.rb#L57-66 and https://github.com/karmi/rubygems.org/blob/search-steps/features/search.feature#L67-70
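For reference, the request body visible in the logged curl commands can be rebuilt as a plain Ruby hash. This is only a sketch that mirrors the logged JSON; the `search_payload` helper is a hypothetical name, and it is not meant to show how Tire assembles the request internally.

```ruby
require 'json'

# Hypothetical helper: rebuilds the JSON body from the logged request.
# The structure is copied from the log; only the method name is invented.
def search_payload(query, size = 30)
  {
    "query" => {
      "bool" => {
        "should" => [
          # Prefix match on the gem name, boosted heavily.
          { "text" => { "name" => { "query" => query, "type" => "phrase_prefix",
                                    "operator" => "and", "boost" => 100 } } },
          # Lucene query syntax; this is the clause that rejects "bang!".
          { "query_string" => { "query" => query, "default_operator" => "and" } }
        ]
      }
    },
    "sort"   => [{ "downloads" => "desc" }, { "name.raw" => "asc" }],
    "filter" => { "term" => { "indexed" => true } },
    "size"   => size
  }
end

puts JSON.generate(search_payload("bang!"))
```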

Amos King
Collaborator

:+1: I went to the test site and loved the functionality that it provides. What can we do to get this brought back up?

app/models/rubygem.rb
@@ -12,7 +15,48 @@ class Rubygem < ActiveRecord::Base
   validate :ensure_name_format
   validates :name, :presence => true, :uniqueness => true
 
-  after_create :update_unresolved
+  after_create :update_unresolved, :update_elasticsearch_index
+  after_touch  :update_elasticsearch_index
+
+  tire do
Nick Quaranto Owner
qrush added a note March 28, 2013

All of this is perfect for a Concern module, something like Searchable.

class Rubygem < ActiveRecord::Base
  include Searchable

And that module has all of the necessary includes, methods, etc. Any thoughts about that approach?

Karel Minarik
karmi added a note March 28, 2013

Nothing against such an approach -- normally, I like to keep mapping/etc definitions inside the model, and since the after_create :update_unresolved hook was already there, I just followed the convention. Do you want to extract everything related to search to a module?
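For anyone following along, a rough idea of what such a module could look like, sketched in plain Ruby without Rails or Tire. Everything except the `Searchable` name is hypothetical: `searchable_fields`, `to_search_document` and the `DemoGem` stand-in class are invented for this illustration, and a real version would wrap the `tire do ... end` mapping and the index callbacks instead.

```ruby
# Hypothetical sketch of a Searchable module in plain Ruby (no Rails, no Tire).
module Searchable
  # Ruby calls this hook when the module is included in a class.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Declare (or read back) which attributes end up in the search document.
    def searchable_fields(*names)
      @searchable_fields = names unless names.empty?
      @searchable_fields || []
    end
  end

  # Build the hash that would be sent to the search index.
  def to_search_document
    self.class.searchable_fields.each_with_object({}) do |field, doc|
      doc[field] = public_send(field)
    end
  end
end

# Stand-in for the real ActiveRecord model.
class DemoGem
  include Searchable
  searchable_fields :name, :downloads

  attr_reader :name, :downloads, :internal_note

  def initialize(name, downloads)
    @name = name
    @downloads = downloads
    @internal_note = "not indexed"
  end
end

document = DemoGem.new("rack", 1_000).to_search_document
puts document.inspect
```

The point of the extraction is that the model only states what is searchable, while the indexing plumbing lives in one place.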

Nick Quaranto qrush commented on the diff March 28, 2013
features/support/env.rb
@@ -4,6 +4,9 @@
 # instead of editing this one. Cucumber will automatically load all features/**/*.rb
 # files.
 
+require 'webmock/cucumber' # Allow connections to elasticsearch
Nick Quaranto Owner
qrush added a note March 28, 2013

Does this mean the test suite is dependent on an elasticsearch install? How would this work on Travis, etc?

Karel Minarik
karmi added a note March 28, 2013

The Cucumber integration test is indeed dependent on Elasticsearch running, since that's the only way to test the feature end to end. Elasticsearch is available on Travis.

app/controllers/searches_controller.rb
@@ -1,8 +1,31 @@
 class SearchesController < ApplicationController
 
+  # Indicate incorrect query to the user
+  rescue_from Tire::Search::SearchRequestFailed do |error|
Nick Quaranto Owner
qrush added a note March 28, 2013

Does this cover the case where ES is completely unavailable/disconnected?

Karel Minarik
karmi added a note March 28, 2013

No, that would have to be handled by a separate rescue_from clause, displaying an error such as "We're sorry, search is currently not available".
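A minimal sketch of that separate handling, in plain Ruby rather than as a Rails `rescue_from` clause. The `search_with_outage_handling` helper and the `SearchOutcome` struct are hypothetical names, and which exception classes the HTTP client actually raises when Elasticsearch is unreachable is an assumption here, not something verified against Tire.

```ruby
# Hypothetical sketch: treat "search backend unreachable" separately from
# "query rejected". Plain Ruby; no Rails or Tire involved.
SearchOutcome = Struct.new(:results, :error_message)

UNAVAILABLE_MESSAGE = "We're sorry, search is currently not available"

# Runs the given block and converts connection-level failures into a
# user-facing message. The rescued exception classes are an assumption
# about what the underlying HTTP client would raise.
def search_with_outage_handling
  SearchOutcome.new(yield, nil)
rescue Errno::ECONNREFUSED, Errno::EHOSTUNREACH, SocketError
  SearchOutcome.new([], UNAVAILABLE_MESSAGE)
end

ok   = search_with_outage_handling { ["rack"] }
down = search_with_outage_handling { raise Errno::ECONNREFUSED }

puts ok.results.inspect
puts down.error_message
```

Other errors, such as a rejected query, still propagate, so the existing handler for incorrect syntax would keep working independently.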

Nick Quaranto
Owner
qrush commented March 28, 2013

Left a few comments. I think we should get ES setup in rubygems/rubygems-aws soon so we can start playing with it...maybe we can wire up the test heroku app to give it a test first.

Some more feedback:

  • Let's get info for how to get ES setup in CONTRIBUTING.md
  • Does anyone else have experience with maintaining a running cluster? What if there's problems? Who will get alerted, who will debug it, etc? (I'm trying to answer this now instead of when the fire is blazing)
  • What if search goes down, can we fall back to the old search?
Karel Minarik
karmi commented March 28, 2013

> I think we should get ES setup in rubygems/rubygems-aws soon

Please keep me in the loop, I'm the author of the Chef cookbook.

> Let's get info for how to get ES setup in CONTRIBUTING.md

I'll put it in, and force push the commits here.

> Does anyone else have experience with maintaining a running cluster?

I'm employed by Elasticsearch.com and have some experience with running Elasticsearch clusters :) Let's talk about the exact process.

> What if search goes down, can we fall back to the old search?

I don't think that's a good solution. A running Elasticsearch cluster shouldn't just go down -- we just need to ensure there is proper monitoring at both the service and the EC2 level.

Nick Quaranto
Owner
qrush commented March 28, 2013

@karmi: This all sounds awesome :) Would you be willing to contribute that into https://github.com/rubygems/rubygems-aws ? I'm very sure our new ops contributors would be more than willing to help get everything set up.

"Shouldn't just go down" is not what I've seen...I'd rather account/test for that now when we're doing the switch and migration instead of when it's on fire.

Karel Minarik
karmi commented March 28, 2013

Yes, I'll set up the environment for rubygems-aws and add a pull request for Elasticsearch.

As for going down, all services and servers can go down :) But I think we need to come up with a process for that, instead of falling back on the SQL based search; that just doesn't feel right. There are many aspects here, e.g. having nodes properly distributed across AWS zones, having an automated strategy for recovering from backup or reindexing from scratch, etc.

Nick Quaranto
Owner
qrush commented March 28, 2013

Cool. That would be neat. We don't even have any of that in place for the main app yet (AFAIK)

Karel Minarik karmi referenced this pull request from a commit in karmi/rubygems-aws April 27, 2013
Karel Minarik [SEARCH] Added configuration for Elasticsearch nodes
This commit adds support for search nodes running Elasticsearch.

* The "elasticsearch" cookbook [https://github.com/elasticsearch/cookbook-elasticsearch/]
  has been added to the Cheffile

* A Vagrant VM named `search` has been added

* A `search` role has been added

* Node configurations (*.json) for Vagrant and EC2 have been added

* The Capistrano tasks have been updated to reflect the changes

To deploy in EC2:

    # Update packages
    #
    RUBYGEMS_EC2_SEARCH=abc-123.compute-1.amazonaws.com \
    DEPLOY_USER=ubuntu \
    DEPLOY_SSH_KEY=~/.ssh/mykey.pem \
      cap rubygems.org invoke COMMAND='sudo apt-get update' SUDO=true

    # Install Chef
    #
    RUBYGEMS_EC2_SEARCH=abc-123.compute-1.amazonaws.com \
    DEPLOY_USER=ubuntu \
    DEPLOY_SSH_KEY=~/.ssh/mykey.pem \
      cap rubygems.org invoke COMMAND='curl -# -L http://www.opscode.com/chef/install.sh | sudo bash -s --' SUDO=true

    # Run Chef
    #
    time \
    RUBYGEMS_EC2_SEARCH=abc-123.compute-1.amazonaws.com \
    DEPLOY_USER=ubuntu \
    DEPLOY_SSH_KEY=~/.ssh/mykey.pem \
      cap rubygems.org chef:search

Related: rubygems/rubygems.org#455
aaa86e5
Karel Minarik karmi referenced this pull request from a commit in karmi/rubygems-aws May 09, 2013
Karel Minarik [SEARCH] Added a template for Elasticsearch application initializer
The `elasticsearch_url` variable is set in the "secret/rubygems" data bag,
similar to setting PostgreSQL host, etc.

Alternatively, an environment variable `ELASTICSEARCH_URL` could be used.

Related: rubygems/rubygems.org#455
0f9c517
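A small sketch of the environment variable alternative mentioned in this commit message, in plain Ruby. The `elasticsearch_url` helper and the fallback to `http://localhost:9200` are assumptions for illustration (the default mirrors the local address used in the test setup); the actual initializer template lives in the rubygems-aws commit and may look different.

```ruby
require 'uri'

# Assumed default, matching the local address used elsewhere in this PR.
DEFAULT_ELASTICSEARCH_URL = "http://localhost:9200"

# Hypothetical helper: read the URL from the environment, fall back to the
# default, and fail early if the value is not an HTTP(S) URL.
def elasticsearch_url(env = ENV)
  raw = env.fetch("ELASTICSEARCH_URL", DEFAULT_ELASTICSEARCH_URL)
  uri = URI.parse(raw)
  raise ArgumentError, "not an HTTP(S) URL: #{raw}" unless uri.is_a?(URI::HTTP)
  uri.to_s
end

puts elasticsearch_url({})
puts elasticsearch_url("ELASTICSEARCH_URL" => "http://search.internal:9200")
```

Passing the environment in as an argument keeps the lookup easy to test without touching the real `ENV`.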
Karel Minarik karmi referenced this pull request in rubygems/rubygems-aws May 10, 2013
Merged

Added Elasticsearch integration #122

Karel Minarik karmi referenced this pull request from a commit in karmi/rubygems-aws April 27, 2013
Karel Minarik [SEARCH] Added configuration for Elasticsearch nodes
Related: rubygems/rubygems.org#455
02d8518
Karel Minarik karmi referenced this pull request from a commit in karmi/rubygems-aws May 09, 2013
Karel Minarik [SEARCH] Added a template for Elasticsearch application initializer
Related: rubygems/rubygems.org#455
8e0e3dd
Karel Minarik
karmi commented May 10, 2013

Hi all, rebased the branch against current master and added some commits. There's a new test server available here:

http://54.235.152.92:3000/search?utf8=✓&query=name%3Arack

which has been created as part of the rubygems/rubygems-aws#122 pull request.

features/step_definitions/gem_steps.rb
((6 lines not shown))
+  table.hashes.each do |row|
+    # p 'GOT TABLE ROW:', row, '-'*80
+    if row['downloads']
+      rubygem = FactoryGirl.create :rubygem_with_downloads, :name => row['name'], :downloads => row['downloads']
+    else
+      rubygem = FactoryGirl.create :rubygem, :name => row['name']
+    end
+
+    FactoryGirl.create(:version, :rubygem => rubygem) do |version|
+      version.number      = row['version']
+      version.authors     = row['authors'].split(/\s*,\s*/)
+      version.summary     = row['summary']
+      version.description = row['description']
+
+      version.save
+      # p "CREATED RUBYGEM:", version.rubygem, version, '-'*80
Vipul A M
vipulnsward added a note May 10, 2013

this p could be removed now

Karel Minarik
karmi added a note May 15, 2013

Both removed in karmi/rubygems.org@dcb887a.

features/step_definitions/gem_steps.rb
@@ -65,3 +70,24 @@
     rubygem.ownerships.create :user => user
   end
 end
+
+Given /^gems with these properties exist:$/ do |table|
+  table.hashes.each do |row|
+    # p 'GOT TABLE ROW:', row, '-'*80
Vipul A M
vipulnsward added a note May 10, 2013

ditto

Karel Minarik
karmi commented May 15, 2013

@vipulnsward Commented out debug statements for Cucumber removed in karmi/rubygems.org@dcb887a.

Gustavo Barron
cicloid commented May 24, 2013

Just saw the pull request, is there something in need of doing or testing, in order to move this forward?

Karel Minarik

karmi opened this pull request a year ago

Guys, we just passed an anniversary with this pull request. What should be done with it? Should I close it?

/cc @qrush @evanphx @skottler

Vipul A M

:cry: I hope not.

knappe

Can we get some more traction on this? This is a very intriguing feature set.

/cc @qrush @evanphx @skottler

Sam Kottler
Collaborator

@karmi can you please rebase?

@knappe the best way to help move this forward is to do a thorough code review.

added some commits August 24, 2012
Karel Minarik [SEARCH] Added "tire" dependency for searching Rubygems.org with elas…
…ticsearch

elasticsearch is an open source search engine based on Lucene, with a RESTful HTTP interface and advanced distributed features.

Tire is a Ruby API/DSL for elasticsearch, with an out-of-the box ActiveRecord/ActiveModel integration.

See:

* "Tire": https://github.com/karmi/tire
* "elasticsearch": http://elasticsearch.org
e18cfb2
Karel Minarik [SEARCH] Allow connections to elasticsearch [localhost:9200] in tests…
… and Cucumber

NOTE: The `disable_net_connect!` call has to come *before* we load the application,
      because Tire checks for index existence on application boot, and shoots
      the entire test suite down.

      See <karmi/retire#136> for more information.
6632bc2
Karel Minarik [SEARCH] Added elementary Tire integration into the Rubygem model
* Added, that the Version model propagates touches to Rubygem [See: http://stackoverflow.com/a/11711477/95696]

* Added Tire ActiveRecord callbacks [See: https://github.com/karmi/tire#activemodel-integration]

* Added a simple mapping definition for Rubygem

* Added a simple `to_indexed_json` serialization for Tire

* Fixed incorrect test case in WebHookTest ("include an Authorization header"):
  1) use the `build`, not the `create` FactoryGirl strategy (to skip Tire indexing), and,
  2) use the _last_ HTTP request from WebMock registry (to skip Tire checking if the index exists)

* Fixed failing "Web Hooks" feature, using the _last_ HTTP request from WebMock registry (see above)

Import your current database with the following Rake task:

    $ bundle exec rake environment tire:import CLASS='Rubygem' FORCE=1

Check the index in your browser:

    <http://localhost:9200/development_rubygems/_search>
2e56fcb
Karel Minarik [SEARCH] Added the simplest possible search with elasticsearch
* Added simple query string search into SearchesController

* Recreate elasticsearch index in the SearchesController functional test setup and in the Cucumber `Before('@search')` callback

* Trigger index update in the FactoryGirl `after(:create)` callback

* Be more defensive in ApplicationHelper#short_info (in test, some gems don't have versions?)

Note: The "beer laser" => "beer_laser" Cucumber scenario fails,
      due to incorrect analysis of Gem names with underscore.
611774a
Karel Minarik [SEARCH] Added, that factories trigger `touch` callbacks after create
NOTE: We need to be absolutely sure the `Rubygem` instance is touched, because
      we rely on it being indexed in elasticsearch in integration tests.
70f0f1f
Karel Minarik [SEARCH] Added proper analyzer for Rubygem names
With the original (standard) analyzer, a Gem name like "url_mount" would be analyzed as "url_mount",
making searches for "url mount" (without the underscore) fail.

With the new analyzer, *tokens* are split by "special characters" defined by the `Patterns` module (.-_).

Try it out yourself:

  <http://localhost:9200/development_rubygems/_analyze?text=url_mount&field=name>

This change makes the "beer laser" => "beer_laser" Cucumber scenario pass.
2389438
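The effect of the analyzer change in the commit above can be approximated in plain Ruby. This is a sketch of the idea only, not the elasticsearch analyzer or the `Patterns` module: the `name_tokens` and `name_matches?` helpers are hypothetical, and the lowercasing and the "all query words must be tokens" rule are simplifying assumptions.

```ruby
# Characters the commit message names as separators: ".", "-" and "_".
SEPARATORS = /[.\-_]/

# Hypothetical helper: split a gem name into lowercase tokens.
def name_tokens(gem_name)
  gem_name.downcase.split(SEPARATORS).reject(&:empty?)
end

# Hypothetical helper: true when every word of the query is one of the
# tokens of the gem name (word order is ignored in this sketch).
def name_matches?(gem_name, query)
  (query.downcase.split - name_tokens(gem_name)).empty?
end

puts name_tokens("url_mount").inspect
puts name_matches?("url_mount", "url mount")
puts name_matches?("beer_laser", "beer laser")
```

Without the split, "url_mount" stays a single token, which is why a query for "url mount" could not match it.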
Karel Minarik [SEARCH] Changed the search definition to a DSL-based syntax, added s…
…orting by downloads

* Used the DSL notation for defining the search: using a `match` prefix query on the "name" field,
  basically replicating the simple query string search with wildcards with a more performant version,
  and using a filter on the `indexed` property

* Added sorting of the results by downloads (descending)

* Added a Cucumber scenario for showing the more downloaded gems higher in search results

* Added supporting Cucumber code: a "I have a gem with downloads" and "I see these search results" step definitions
bfd2aa7
Karel Minarik [SEARCH] Changed, that search results are ordered first by downloads,…
… then alphabetically

* Changed the `name` property to multi-field, using the "keyword" analyzer on `name.raw` for searching
* Added the sort block with multiple sort fields
* Added a Cucumber scenario

NOTE: Now we should really stop and think twice about how to make the results more relevant.
      We should be able to get better search _precision_ by using the `Rubygem#downloads`
      counter as a factor affecting score, not just plainly sort on its value.
db6e1d3
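The ordering introduced by the commit above (downloads descending, then name ascending) behaves like a two-key sort. A minimal plain Ruby sketch with made-up sample data; it imitates the result order only and says nothing about how elasticsearch evaluates the sort block.

```ruby
# Made-up sample data: two pairs of gems with equal download counts.
gems = [
  { name: "zeta",  downloads: 500 },
  { name: "alpha", downloads: 10 },
  { name: "beta",  downloads: 500 },
  { name: "gamma", downloads: 10 }
]

# Negating the count gives descending downloads; the name breaks ties
# in ascending alphabetical order.
sorted_names = gems.sort_by { |g| [-g[:downloads], g[:name]] }
                   .map { |g| g[:name] }

puts sorted_names.inspect
```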
Karel Minarik [SEARCH] Added a more complex mapping definition and serialization fo…
…r the Rubygem model

* Previously, only the `name`, `downloads` and `indexed` attributes were indexed,
  replicating the functionality of the current search feature.

* The `to_indexed_json` method was removed, relying on Tire's JSON serialization routines
  based on the model mapping definition.

* The `summary`, `description` and `author` gem properties were added, allowing much better
  search results _recall_, ie. allowing search in these fields as well and widening the search “net”.

* A gem which mentions "sinatra" in its summary/description will now be matched (with a lower score):
  <http://localhost:3000/search?query=sinatra>.

* A gem written by Florian Hanke will now be matched: http://localhost:3000/search?query=florian+hanke

* The `version` gem property was added, allowing searches based on gem versions, for instance looking
  for Sinatra 1.3.2: <http://localhost:3000/search?query=name:sinatra+version:1.3.2>. For improved
  usability, the link from the result listing _should_ lead to the relevant version page,
  ie. http://localhost:3000/gems/sinatra/versions/1.3.2, not the last version.

* The `depends` and `uses` gem properties were added, which index runtime gem dependencies and all
  gem dependencies, respectively. It allows searches such as <http://localhost:3000/search?query=depends:rack>
  (for gems which depend on rack) or http://localhost:3000/search?query=uses:rack (for gems which use rack in
  one way or other).

* The `created_at` and `updated_at` gem properties were added, which allow searching for gems updated in a specific
  period, for instance on August 26th: <http://localhost:3000/search?query=updated_at:[2012-08-26+TO+2012-08-27]>

* The `author`, `created_at` and `updated_at` also allow for a _faceted navigation_ in the future, ie. searching
  for certain gem while restricting the result to certain author or time.

* These properties also allow for computing statistics on the Rubygem collection, such as displaying authors
  with most gems, or authors of the most downloaded gems, etc.

You have to reindex the elasticsearch index, to pick up the new mapping and index records properly:

    $ bundle exec rake environment tire:import CLASS='Rubygem' FORCE=1

See the following resources for information on previous efforts to implement a better Rubygems.org search:

* https://groups.google.com/forum/#!topic/gemcutter/xIzyTmFdXVo/discussion
* http://florianhanke.com/blog/2011/02/13/a-better-rubygems-search.html
* http://blog.websolr.com/post/3505941785/rubygems-search-upgrade-2
* http://blog.websolr.com/post/3505969969/rubygems-search-upgrade-3
aa63c84
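The field-scoped searches listed in the commit above are ordinary query-string URLs, so they can be generated and checked with Ruby's standard library. A small sketch: the `search_url` helper is hypothetical, and the base URL is the local development address used in the commit message examples.

```ruby
require 'uri'

# Base URL taken from the examples in the commit message (local dev server).
SEARCH_URL = "http://localhost:3000/search"

# Hypothetical helper: build a search URL with a properly escaped query.
def search_url(query)
  "#{SEARCH_URL}?#{URI.encode_www_form(query: query)}"
end

range_query = "updated_at:[2012-08-26 TO 2012-08-27]"
url = search_url(range_query)

# Decoding the URL again should give back the original query text.
decoded = URI.decode_www_form(URI.parse(url).query).to_h

puts url
puts decoded["query"]
```

Escaping matters most for range searches, where the brackets and spaces are not safe to paste into a URL as-is.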
Karel Minarik [SEARCH] Mock HTTP responses to Elasticsearch in unit tests 79b6857
Karel Minarik [SEARCH] Added a more complex search query in the SearchesController#…
…show method

Previously, we have been searching gems based on their names only.

With the new, more complex mapping defined in the preceding commit, we can add a more complex search query as well.

We're using a boolean query, keeping the original match prefix query and adding the `query_string` query,
which uses the Lucene query syntax (field specification, boolean operators, wildcards,
fuzzy search, range and proximity searches, grouping, etc).

See:

* http://www.elasticsearch.org/guide/reference/query-dsl/query-string-query.html
* http://lucene.apache.org/core/3_6_1/queryparsersyntax.html
415ba28
Karel Minarik [SEARCH] Added a `rescue_from` failed search requests due to incorrec…
…t query syntax

While we exposed the most powerful way of searching to the user (the Lucene query syntax),
it can quite easily lead to application errors when users enter incorrect queries, such as `bang!` or `foo[]`.

Since this is an error on the user's part, and not the application's, we should display a friendly
error explanation and give the user a chance to correct the query.
f0d20d4
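
The idea of turning a failed search request into a friendly message can be sketched in plain Ruby, without Rails or Tire. Everything here is an assumption made for illustration: `SearchRequestFailed` is a local stand-in for `Tire::Search::SearchRequestFailed`, the `search_with_friendly_errors` helper is hypothetical, and the fallback message for other failures is invented for this sketch.

```ruby
# Stand-in for Tire::Search::SearchRequestFailed, defined here only so that
# the sketch runs without the Tire gem.
class SearchRequestFailed < StandardError; end

# Hypothetical helper: runs the search block and returns [results, notice].
def search_with_friendly_errors
  [yield, nil]
rescue SearchRequestFailed => e
  notice = if e.message =~ /SearchParseException/
    'Sorry, your query is incorrect.'
  else
    # Assumed fallback wording, not taken from the pull request.
    'Sorry, the search request failed.'
  end
  [[], notice]
end

results, notice = search_with_friendly_errors do
  raise SearchRequestFailed, 'SearchParseException[Failed to parse source]'
end
puts notice
```

In the actual controller the same decision is made inside a `rescue_from` block, which then renders the search page again so the user can correct the query.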
Karel Minarik [SEARCH] Added a "user enters a search query with incorrect syntax" C…
…ucumber scenario

Since the application uses Cucumber scenarios for validating its proper operation,
a scenario in which the user enters an incorrect search query ("bang!") has been added.
5c13726
Karel Minarik [SEARCH] Added the "Search Advanced" Cucumber feature
With the complex queries now available to users of the application, we should add
acceptance tests for the common scenarios.

We'll start with searching in summaries and descriptions (thanks to the `_all` field
automatically generated by elasticsearch).

Use this command to run all search features:

    $ bundle exec cucumber --tag @search

Use this command to run the "advanced search" feature:

    $ bundle exec cucumber features/search_advanced.feature
ac96085
Karel Minarik [SEARCH] Added a Cucumber scenario for searching in gem authors
    Given we now have a more complex search available
    When we search in the `author` field
    We should get some relevant results

* Added a "Searching in authors" scenario
* Added a step for creating more complex Rubygem records into the `gem_steps.rb` definition file

Use this command to run the scenario:

    $ bundle exec cucumber --name "Searching in authors" features/search_advanced.feature
0346d62
Karel Minarik [SEARCH] Refactored the search steps to a higher-level nested step "I…
… search for ..."

Instead of repeating the low-level steps:

    When I go to the homepage
    And I fill in "query" with "<query>"
    And I press "Search"

over and over in our scenarios, we will abstract these steps to a single step:

    When I search for "<query>"

The obvious benefit is less code duplication and more readable steps.
4eb11a8
Karel Minarik [SEARCH] Added "search tips" sliding panel at the search results page
* Added a second form with `query` input, to duplicate the query for easier correction/change
  at the results page

* Added a HTML partial with concrete, interactive examples of queries possible with Lucene,
  hidden by default

* Added a link and JavaScript code to toggle the sliding panel with search examples

* Added CSS styling for the new elements, added a "help.png" icon from the FamFamFam suite
d389af0
Karel Minarik [SEARCH] Added starting of "elasticsearch" in the Travis CI configura…
…tion
01f479b
Karel Minarik [SEARCH] Prevent indexing errors on Rubygem records without a version cf5beec
Karel Minarik [SEARCH] Added information about installing Elasticsearch into "Contr…
…ibution Guidelines"
382b3a6
Karel Minarik [SEARCH] Handle search engine being not available in user-friendly way c1f99aa
Karel Minarik [SEARCH] Changed, that errors when indexing to Elasticsearch are rescued
Previously, when an error occurred while saving the model into the Elasticsearch index,
the whole operation failed and an exception was raised.

This patch adds a `rescue` clause which logs the exception and swallows it.
06f2626
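
The log-and-swallow pattern from this commit can be sketched in plain Ruby. Assumptions are labeled in the code: `update_search_index` is a hypothetical stand-in for the model's indexing call, and the logger writes to an in-memory buffer instead of `Rails.logger`. The sketch rescues `StandardError`, whereas the patch in this pull request rescues `Exception`. The narrower class is a deliberate substitution here, since it leaves signals and interrupts alone.

```ruby
require 'logger'
require 'stringio'

# In-memory log target, used instead of Rails.logger for this sketch.
LOG_OUTPUT = StringIO.new
LOGGER = Logger.new(LOG_OUTPUT)

# Hypothetical stand-in for the model's indexing call; it always fails here.
def update_search_index
  raise IOError, 'connection refused'
end

# Log the failure and return true so that the surrounding save continues.
def update_search_index_with_rescue
  update_search_index
rescue StandardError => e
  LOGGER.error "Error when updating Elasticsearch. Original exception: #{e.inspect}"
  true
end

update_search_index_with_rescue
puts LOG_OUTPUT.string
```

The trade-off is that a failed index update is only visible in the log, so the index can silently fall behind the database until it is rebuilt.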
Karel Minarik

@skottler Rebased, fixed problems with Webmock stubbing, force pushed.

Karel Minarik

I have terminated the EC2 instances for the demo application.

Jimmy Cuadra

For a project I'm working on, I'd like to be able to search gems based on gem specification metadata (the metadata hash attribute available from RubyGems 2.0 and up). I was investigating how gem searching is implemented currently, and after seeing that it was such a simple SQL query, figured someone had to be working on an ES-based search feature, and sure enough, here it is in this pull request.

Long story short, I'm very interested in seeing this rolled out and would like to help, since it's been sitting idle for quite some time. Is a code review still the blocker here?

Jimmy Cuadra

It also looks like the tire gem has been deprecated in favor of multiple gems hosted at elasticsearch/elasticsearch-ruby. This PR should be updated to use the new goods.

Karel Minarik karmi closed this December 11, 2013

Showing 22 unique commits by 1 author.

Sep 11, 2013
Karel Minarik [SEARCH] Added "tire" dependency for searching Rubygems.org with elas…
…ticsearch

elasticsearch is an open source search engine based on Lucene, with a RESTful HTTP interface and advanced distributed features.

Tire is a Ruby API/DSL for elasticsearch, with an out-of-the box ActiveRecord/ActiveModel integration.

See:

* "Tire": https://github.com/karmi/tire
* "elasticsearch": http://elasticsearch.org
e18cfb2
Karel Minarik [SEARCH] Allow connections to elasticsearch [localhost:9200] in tests…
… and Cucumber

NOTE: The `disable_net_connect!` call has to come *before* we load the application,
      because Tire checks for index existence on application boot, and shoots
      the entire test suite down.

      See <karmi/retire#136> for more information.
6632bc2
Karel Minarik [SEARCH] Added elementary Tire integration into the Rubygem model
* Added, that the Version model propagates touches to Rubygem [See: http://stackoverflow.com/a/11711477/95696]

* Added Tire ActiveRecord callbacks [See: https://github.com/karmi/tire#activemodel-integration]

* Added a simple mapping definition for Rubygem

* Added a simple `to_indexed_json` serialization for Tire

* Fixed incorrect test case in WebHookTest ("include an Authorization header"):
  1) use the `build`, not the `create` FactoryGirl strategy (to skip Tire indexing), and,
  2) use the _last_ HTTP request from WebMock registry (to skip Tire checking if the index exists)

* Fixed failing "Web Hooks" feature, using the _last_ HTTP request from WebMock registry (see above)

Import your current database with the following Rake task:

    $ bundle exec rake environment tire:import CLASS='Rubygem' FORCE=1

Check the index in your browser:

    <http://localhost:9200/development_rubygems/_search>
2e56fcb
Karel Minarik [SEARCH] Added the simplest possible search with elasticsearch
* Added simple query string search into SearchesController

* Recreate elasticsearch index in the SearchesController functional test setup and in the Cucumber `Before('@search')` callback

* Trigger index update in the FactoryGirl `after(:create)` callback

* Be more defensive in ApplicationHelper#short_info (in test, some gems don't have versions?)

Note: The "beer laser" => "beer_laser" Cucumber scenario fails,
      due to incorrect analysis of Gem names with underscore.
611774a
Karel Minarik [SEARCH] Added, that factories trigger `touch` callbacks after create
NOTE: We need to be absolutely sure the `Rubygem` instance is touched, because
      we rely on it being indexed in elasticsearch in integration tests.
70f0f1f
Karel Minarik [SEARCH] Added proper analyzer for Rubygem names
With the original (standard) analyzer, a Gem name like "url_mount" would be analyzed as "url_mount",
making searches for "url mount" (without the underscore) fail.

With the new analyzer, *tokens* are split by "special characters" defined by the `Patterns` module (.-_).

Try it out yourself:

  <http://localhost:9200/development_rubygems/_analyze?text=url_mount&field=name>

This change makes the "beer laser" => "beer_laser" Cucumber scenario pass.
2389438
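
The effect of the new analyzer can be approximated in plain Ruby. This is only an emulation under assumptions: the special characters are taken from the commit message (`.`, `-` and `_`), the `analyze_gem_name` helper is hypothetical, and lowercasing is assumed to mirror the default behaviour of a "pattern" analyzer.

```ruby
# Assumed from the commit message: the special characters are ".", "-" and "_".
SPECIAL_CHARACTERS = '.-_'

# Rough emulation of a "pattern" analyzer: lowercase the input, then split it
# on whitespace and on the special characters.
def analyze_gem_name(name)
  separator = /[\s#{Regexp.escape(SPECIAL_CHARACTERS)}]+/
  name.downcase.split(separator).reject(&:empty?)
end

p analyze_gem_name('url_mount')   # => ["url", "mount"]
p analyze_gem_name('twitter-cli') # => ["twitter", "cli"]
```

With tokens split this way, a query for "url mount" and a gem named "url_mount" produce the same tokens, which is why the scenario passes.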
Karel Minarik [SEARCH] Changed the search definition to a DSL-based syntax, added s…
…orting by downloads

* Used the DSL notation for defining the search: using a `match` prefix query on the "name" field,
  basically replicating the simple query string search with wildcards with a more performant version,
  and using a filter on the `indexed` property

* Added sorting of the results by downloads (descending)

* Added a Cucumber scenario for showing the more downloaded gems higher in search results

* Added supporting Cucumber code: a "I have a gem with downloads" and "I see these search results" step definitions
bfd2aa7
Karel Minarik [SEARCH] Changed, that search results are ordered first by downloads,…
… then alphabetically

* Changed the `name` property to multi-field, using the "keyword" analyzer on `name.raw` for searching
* Added the sort block with multiple sort fields
* Added a Cucumber scenario

NOTE: Now we should really stop and think twice about how to make the results more relevant.
      We should be able to get better search _precision_ by using the `Rubygem#downloads`
      counter as a factor affecting score, not just plainly sort on its value.
db6e1d3
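
The two-level ordering described in this commit (downloads descending, then name ascending) can be illustrated with in-memory data. The sample gems and the `sort_results` helper are hypothetical. In the pull request the sorting is done by Elasticsearch, not in Ruby.

```ruby
# Hypothetical sample data; download counts are made up.
gems = [
  { :name => 'rack',       :downloads => 500 },
  { :name => 'rails',      :downloads => 900 },
  { :name => 'rack-cache', :downloads => 500 }
]

# Sort by downloads (descending) first, then by name (ascending) as a tie-breaker.
def sort_results(gems)
  gems.sort_by { |gem| [-gem[:downloads], gem[:name]] }
end

puts sort_results(gems).map { |gem| gem[:name] }.join(', ')
# prints "rails, rack, rack-cache"
```

As the note in the commit message says, a plain sort like this ignores relevance entirely, which is the motivation for using the download counter as a scoring factor instead.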
Karel Minarik [SEARCH] Added a more complex mapping definition and serialization fo…
…r the Rubygem model

* Previously, only the `name`, `downloads` and `indexed` attributes were indexed,
  replicating the functionality of the current search feature.

* The `to_indexed_json` method was removed, relying on Tire's JSON serialization routines
  based on the model mapping definition.

* The `summary`, `description` and `author` gem properties were added, allowing much better
  search results _recall_, i.e. allowing search in these fields as well and widening the search “net”.

* A gem which mentions "sinatra" in its summary/description will now be matched (with a lower score):
  <http://localhost:3000/search?query=sinatra>.

* A gem written by Florian Hanke will now be matched: http://localhost:3000/search?query=florian+hanke

* The `version` gem property was added, allowing searches based on gem versions, for instance looking
  for Sinatra 1.3.2: <http://localhost:3000/search?query=name:sinatra+version:1.3.2>. For improved
  usability, the link from the result listing _should_ lead to the relevant version page,
  i.e. http://localhost:3000/gems/sinatra/versions/1.3.2, not the latest version.

* The `depends` and `uses` gem properties were added, which index runtime gem dependencies and all
  gem dependencies, respectively. They allow searches such as <http://localhost:3000/search?query=depends:rack>
  (for gems which depend on rack) or http://localhost:3000/search?query=uses:rack (for gems which use rack in
  one way or another).

* The `created_at` and `updated_at` gem properties were added, which allow searching for gems updated in a specific
  period, for instance on August 26th: <http://localhost:3000/search?query=updated_at:[2012-08-26+TO+2012-08-27]>

* The `author`, `created_at` and `updated_at` properties also allow for _faceted navigation_ in the future, i.e. searching
  for a certain gem while restricting the results to a certain author or time period.

* These properties also allow for computing statistics on the Rubygem collection, such as displaying authors
  with most gems, or authors of the most downloaded gems, etc.
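
The example URLs in the list above can also be built programmatically. The following is a minimal sketch using only Ruby's standard library. The `search_url` helper and the default base URL are assumptions for illustration and are not part of the application.

```ruby
require 'uri'

# Hypothetical helper: builds a search URL for a Lucene-syntax query,
# percent-encoding characters such as ":" and "[" in the query value.
def search_url(query, base = 'http://localhost:3000/search')
  "#{base}?#{URI.encode_www_form(:query => query)}"
end

puts search_url('depends:rack')
puts search_url('updated_at:[2012-08-26 TO 2012-08-27]')
```

Encoding the query this way keeps range queries intact when the URL is pasted into tools that are stricter than a browser address bar.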

You have to rebuild the elasticsearch index to pick up the new mapping and index the records properly:

    $ bundle exec rake environment tire:import CLASS='Rubygem' FORCE=1

See the following resources for information on previous efforts to implement a better Rubygems.org search:

* https://groups.google.com/forum/#!topic/gemcutter/xIzyTmFdXVo/discussion
* http://florianhanke.com/blog/2011/02/13/a-better-rubygems-search.html
* http://blog.websolr.com/post/3505941785/rubygems-search-upgrade-2
* http://blog.websolr.com/post/3505969969/rubygems-search-upgrade-3
aa63c84
Showing 45 changed files with 491 additions and 41 deletions.

  1. 2  .travis.yml
  2. 18  CONTRIBUTING.md
  3. 1  Gemfile
  4. 11  Gemfile.lock
  5. 44  app/controllers/searches_controller.rb
  6. 2  app/helpers/application_helper.rb
  7. 7  app/helpers/searches_helper.rb
  8. 4  app/models/dependency.rb
  9. 2  app/models/linkset.rb
  10. 2  app/models/ownership.rb
  11. 53  app/models/rubygem.rb
  12. 2  app/models/subscription.rb
  13. 2  app/models/version.rb
  14. 43  app/views/searches/_search_tips.en.html.erb
  15. 10  app/views/searches/show.html.erb
  16. 3  config/environments/test.rb
  17. 2  config/locales/en.yml
  18. 46  features/search.feature
  19. 29  features/search_advanced.feature
  20. 24  features/step_definitions/gem_steps.rb
  21. 15  features/step_definitions/search_steps.rb
  22. 2  features/step_definitions/webhook_steps.rb
  23. 3  features/support/env.rb
  24. 9  features/support/gemcutter.rb
  25. BIN  public/images/help.png
  26. 13  public/javascripts/application.js
  27. 78  public/stylesheets/screen.css
  28. 10  test/factories.rb
  29. 16  test/functional/searches_controller_test.rb
  30. 4  test/test_helper.rb
  31. 5  test/unit/dependencies_middleware_test.rb
  32. 5  test/unit/dependency_test.rb
  33. 5  test/unit/download_test.rb
  34. 5  test/unit/helpers/rubygems_helper_test.rb
  35. 5  test/unit/hostess_test.rb
  36. 5  test/unit/ownership_test.rb
  37. 5  test/unit/pusher_test.rb
  38. 12  test/unit/rubygem_test.rb
  39. 5  test/unit/subscription_test.rb
  40. 5  test/unit/user_test.rb
  41. 5  test/unit/version_test.rb
  42. 13  test/unit/web_hook_test.rb
  43. BIN  vendor/cache/ansi-1.4.3.gem
  44. BIN  vendor/cache/hashr-0.0.22.gem
  45. BIN  vendor/cache/tire-0.6.0.gem
2  .travis.yml
@@ -9,3 +9,5 @@ language: ruby
 rvm:
   - 1.9.3
 script: bundle exec rake default
+services:
+  - elasticsearch
18  CONTRIBUTING.md
@@ -43,7 +43,7 @@ git remote set-url origin git@github.com:rubygems/rubygems.org.git
 
 Otherwise, you can continue to hack away in your own fork.
 
-If you’re looking for things to hack on, please check 
+If you’re looking for things to hack on, please check
 [GitHub Issues](http://github.com/rubygems/rubygems.org/issues). If you’ve
 found bugs or have feature ideas don’t be afraid to pipe up and ask the
 [mailing list](http://groups.google.com/group/gemcutter) or IRC channel
@@ -127,19 +127,23 @@ running:
     **version 2.0 or higher**. If you have homebrew,
     do `brew install redis -H`, if you use macports,
     do `sudo port install redis`.
+* Install [Elasticsearch](http://www.elasticsearch.org).
+    You can do it with `brew install elasticsearch`,
+    or just download, unzip and run
+    a [release](http://www.elasticsearch.org/download/).
 * Rubygems is configured to use PostgreSQL (>= 8.4.x),
     for MySQL see below. Install with: `brew install postgres`
 
 **Get the code:**
 
 * Clone the repo: `git clone git://github.com/rubygems/rubygems.org`
-* Move into your cloned rubygems directory if you haven’t already: 
+* Move into your cloned rubygems directory if you haven’t already:
     `cd rubygems.org`
-    
+
 **Setup the database:**
 
 * Get set up: `./script/setup`
-* Run the database rake tasks if needed: 
+* Run the database rake tasks if needed:
     `rake db:create:all db:drop:all db:setup db:test:prepare --trace`
 
 **Running tests:**
@@ -151,7 +155,7 @@ running:
 
 * Set the REDISTOGO_URL environment variable. For example:
     `REDISTOGO_URL="redis://localhost:6379"`
-* Import gems if you want to seed the database. 
+* Import gems if you want to seed the database.
     `rake gemcutter:import:process PATHTO_GEMS/cache`
     * _To import a small set of gems you can point the import process to any
         gems cache directory, like a very small `rvm` gemset for instance._
@@ -188,8 +192,8 @@ running:
 
 > **Warning:** Gem names are case sensitive (eg. `BlueCloth` vs.
 > `bluecloth` 2). MySQL has a `utf8_bin` collation, but it appears
-> that you still need to do `BINARY name = ?` for searching. 
-> It is recommended that you stick to PostgreSQL >= 8.4.x 
+> that you still need to do `BINARY name = ?` for searching.
+> It is recommended that you stick to PostgreSQL >= 8.4.x
 > for development. Some tests will also fail if you use MySQL
 > because some queries use SQL functions which don't exist in MySQL..
 
1  Gemfile
@@ -31,6 +31,7 @@ gem 'validates_formatting_of'
 gem 'will_paginate'
 gem 'xml-simple'
 gem 'yajl-ruby', :require => 'yajl'
+gem 'tire'
 
 # enable if on heroku, make sure to toss this into an initializer:
 #     Rails.application.config.middleware.use HerokuAssetCacher
11  Gemfile.lock
@@ -30,6 +30,7 @@ GEM
       multi_json (~> 1.0)
     addressable (2.3.5)
     aggregate (0.2.2)
+    ansi (1.4.3)
     arel (3.0.2)
     bcrypt-ruby (3.1.2)
     bluepill (0.0.66)
@@ -101,6 +102,7 @@ GEM
     gherkin (2.12.1)
       multi_json (~> 1.3)
     gravtastic (3.2.6)
+    hashr (0.0.22)
     high_voltage (1.2.4)
     highline (1.6.19)
     hike (1.2.3)
@@ -212,6 +214,14 @@ GEM
     thor (0.18.1)
     tilt (1.4.1)
     timecop (0.6.3)
+    tire (0.6.0)
+      activemodel (>= 3.0)
+      activesupport
+      ansi
+      hashr (~> 0.0.19)
+      multi_json (~> 1.3)
+      rake
+      rest-client (~> 1.6)
     treetop (1.4.15)
       polyglot
       polyglot (>= 0.3.1)
@@ -279,6 +289,7 @@ DEPENDENCIES
   shoulda
   sinatra
   timecop
+  tire
   unicorn
   validates_formatting_of
   webmock
44  app/controllers/searches_controller.rb
@@ -1,11 +1,49 @@
 class SearchesController < ApplicationController
 
+  # Handle search engine not being available
+  #
+  rescue_from Errno::EHOSTUNREACH, Errno::ECONNREFUSED, SocketError do |error|
+    flash.now[:failure] = "Sorry, search is not available at the moment." if params[:query]
+    render :show, :status => :internal_server_error
+  end
+
+  # Indicate incorrect query to the user
+  #
+  rescue_from Tire::Search::SearchRequestFailed do |error|
+    flash.now[:failure] = "Sorry, your query is incorrect." if error.message =~ /SearchParseException/ && params[:query]
+    render :show, :status => :internal_server_error
+  end
+
   def show
-    if params[:query]
-      @gems = Rubygem.search(params[:query]).with_versions.paginate(:page => params[:page])
+    if params[:query].present?
+      @gems = Rubygem.tire.search :page     => params[:page],
+                                  :per_page => Rubygem.per_page,
+                                  :load     => {:include => 'versions'} do |search|
+
+        search.query do |s|
+          s.filtered do |f|
+            f.query  do |q|
+              q.boolean do |it|
+                it.should { |q| q.match 'name.raw', params[:query], :boost => 500 }
+                it.should { |q| q.match :name, params[:query], :type => 'phrase_prefix', :operator => 'and', :boost => 100 }
+                it.should { |q| q.string params[:query], :default_operator => 'and' }
+              end
+            end
+            f.filter :term, :indexed => true
+          end
+        end
+
+        search.sort   do
+          by 'downloads', :desc
+          by 'name.raw',  :asc
+        end
+
+        # STDOUT.puts search.to_curl if Rails.env.development?
+      end
+
       @exact_match = Rubygem.name_is(params[:query]).with_versions.first
 
-      redirect_to rubygem_path(@exact_match) if @gems == [@exact_match]
+      redirect_to rubygem_path(@exact_match) if @exact_match && @gems.size == 1 && @gems.first.id == @exact_match.id
     end
   end
 
2  app/helpers/application_helper.rb
@@ -16,7 +16,7 @@ def atom_feed_link(title, url)
   end
 
   def short_info(version)
-    truncate(version.info, :length => 100)
+    version ? truncate(version.info, :length => 100) : ''
   end
 
   def gravatar(size, id = "gravatar", user = current_user)
7  app/helpers/searches_helper.rb
@@ -0,0 +1,7 @@
+module SearchesHelper
+
+  def link_to_example_search(query)
+    link_to query, search_url( :query => query, :anchor => 'tips' )
+  end
+
+end
4  app/models/dependency.rb
@@ -1,8 +1,8 @@
 class Dependency < ActiveRecord::Base
   LIMIT = 250
 
-  belongs_to :rubygem
-  belongs_to :version
+  belongs_to :rubygem, :touch => true
+  belongs_to :version, :touch => true
 
   before_validation :use_gem_dependency,
                     :use_existing_rubygem,
2  app/models/linkset.rb
@@ -1,5 +1,5 @@
 class Linkset < ActiveRecord::Base
-  belongs_to :rubygem
+  belongs_to :rubygem, :touch => true
   attr_protected :rubygem_id
 
   LINKS = %w(home wiki docs mail code bugs).freeze
2  app/models/ownership.rb
@@ -1,5 +1,5 @@
 class Ownership < ActiveRecord::Base
-  belongs_to :rubygem
+  belongs_to :rubygem, :touch => true
   belongs_to :user
 
   validates :user_id, :uniqueness => {:scope => :rubygem_id}
53  app/models/rubygem.rb
@@ -1,6 +1,8 @@
 class Rubygem < ActiveRecord::Base
   include Patterns
 
+  include Tire::Model::Search
+
   has_many :owners, :through => :ownerships, :source => :user
   has_many :ownerships, :dependent => :destroy
   has_many :subscribers, :through => :subscriptions, :source => :user
@@ -15,6 +17,50 @@ class Rubygem < ActiveRecord::Base
   after_create :update_unresolved
   before_destroy :mark_unresolved
 
+  after_create  :update_elasticsearch_index_with_rescue
+  after_destroy :update_elasticsearch_index_with_rescue
+  after_touch   :update_elasticsearch_index_with_rescue
+
+  tire do
+    index_prefix Rails.env
+
+    settings :number_of_shards   => 1,
+             :number_of_replicas => 1,
+             :analysis           => {
+               :analyzer => {
+                 :rubygem => {
+                   :type => 'pattern',
+                   :pattern => "[\s#{Regexp.escape(SPECIAL_CHARACTERS)}]+"
+                 }
+               }
+             } do
+      mapping do
+        indexes :name,      :type => 'multi_field',
+                            :fields => {
+                              :name => { :type => 'string', :analyzer => 'rubygem', :boost => 10.0 },
+                              :raw  => { :type => 'string', :analyzer => 'keyword', :boost => 10.0 }
+                            }
+        indexes :indexed,   :type => 'boolean', :include_in_all => false, :as => proc { versions.any?(&:indexed?) }
+        indexes :downloads, :type => 'integer', :include_in_all => false
+
+        indexes :summary,     :analyzer => 'english', :as => proc { versions.most_recent.try(:summary) }
+        indexes :description, :analyzer => 'english', :as => proc { versions.most_recent.try(:description) }
+        indexes :author,      :as => proc { versions.most_recent.try(:authors).try(:split, /\s*,\s*/) }
+
+        indexes :version,     :analyzer => 'keyword', :as => proc { versions.map(&:number) },
+                              :include_in_all => false
+
+        indexes :uses,        :as => proc { versions.most_recent.dependencies.map(&:name) if versions.most_recent rescue nil },
+                              :include_in_all => false
+        indexes :depends,     :as => proc { versions.most_recent.dependencies.runtime.map(&:name) if versions.most_recent rescue nil },
+                              :include_in_all => false
+
+        indexes :created_at,  :type => 'date', :include_in_all => false
+        indexes :updated_at,  :type => 'date', :include_in_all => false
+      end
+    end
+  end
+
   def self.with_versions
     where("rubygems.id IN (SELECT rubygem_id FROM versions where versions.indexed IS true)")
   end
@@ -268,6 +314,13 @@ def gittip_enabled?
     owners.where('gittip_username is not null').count > 0
   end
 
+  def update_elasticsearch_index_with_rescue
+    update_elasticsearch_index
+  rescue Exception => e
+    Rails.logger.error "Error when updating Elasticsearch. Original exception: #{e.inspect}"
+    return true
+  end
+
   private
 
   def ensure_name_format
2  app/models/subscription.rb
@@ -1,5 +1,5 @@
 class Subscription < ActiveRecord::Base
-  belongs_to :rubygem
+  belongs_to :rubygem, :touch => true
   belongs_to :user
 
   validates :rubygem_id, :uniqueness => {:scope => :user_id}
2  app/models/version.rb
@@ -1,5 +1,5 @@
 class Version < ActiveRecord::Base
-  belongs_to :rubygem
+  belongs_to :rubygem, :touch => true
   has_many :dependencies, :order => 'rubygems.name ASC', :include => :rubygem, :dependent => :destroy
 
   before_save      :update_prerelease
43  app/views/searches/_search_tips.en.html.erb
@@ -0,0 +1,43 @@
+<div id="search-tips">
+<div>
+  <p>
+    When looking for gems, you can use a wide variety of search queries
+    in the <a href="http://lucene.apache.org/core/3_6_1/queryparsersyntax.html" class="external">Lucene syntax</a>.
+  </p>
+
+  <p>
+    Quite simply, you can search in gem names, summaries and descriptions with queries like
+    <code><%= link_to_example_search 'rack' %></code> or
+    <code><%= link_to_example_search 'imap' %></code>
+  </p>
+
+  <p>You can, of course, restrict the search to gem names only:</p>
+  <p><code><%= link_to_example_search 'name:rack' %></code></p>
+
+  <p>To broaden your search, you can use wildcards:</p>
+  <p>
+    <code><%= link_to_example_search 'name:ra*' %></code> or
+    <code><%= link_to_example_search 'web*' %></code>
+  </p>
+
+  <p>You can search for specific gem authors:</p>
+  <p><code><%= link_to_example_search 'author:john' %></code></p>
+
+  <p>Of course, you can combine these queries into complex ones:</p>
+  <p>
+    <code><%= link_to_example_search 'name:ra* AND author:john' %></code> or
+    <code><%= link_to_example_search 'name:ra* AND version:1*' %></code>
+  </p>
+
+  <p>To discover more gems, you can search by their depencies in runtime:</p>
+  <p><code><%= link_to_example_search 'depends:rack' %></code></p>
+  <p>or in development:</p>
+  <p><code><%= link_to_example_search 'uses:rack' %></code></p>
+
+  <p>Lastly, you can restrict your search to gems created or updated in certain timeframe:</p>
+  <p><code><%= link_to_example_search "name:rack AND updated_at:[#{Time.now.to_date.beginning_of_month.to_s(:db)} TO #{Time.now.to_date.end_of_month.to_s(:db)}]" %></code></p>
+
+  <p class="legend">The searchable fields are <em>name</em>, <em>summary</em>, <em>description</em>, <em>author</em>, <em>version</em>, <em>uses</em>, <em>depends</em>, <em>created_at</em>, <em>updated_at</em> and <em>downloads</em>.</p>
+
+</div>
+</div>
10  app/views/searches/show.html.erb
@@ -1,6 +1,14 @@
 <% @title = "search" %>
+
+<% @subtitle = t('.subtitle', :query => nil) if params[:query].present? %>
+<%= form_tag search_url, :id => "in-page-search", :method => :get do %>
+  <%= text_field_tag :query, params[:query] if params[:query].present? %>
+  <a href="#" id="search-tips-toggle" title="<%= t '.tips_tooltip' %>"><%= t '.tips' %></a>
+<% end %>
+
+<%= render :partial => 'search_tips' %>
+
 <% if @gems %>
-  <% @subtitle = t('.subtitle', :query => content_tag(:em, h(params[:query]))) %>
   <% if @exact_match %>
     <p><%= t '.exact_match' %></p>
     <div class="gems border">
3  config/environments/test.rb
... ...
@@ -1,3 +1,6 @@
  1
+require 'webmock'          # Allow connections to elasticsearch
  2
+WebMock.disable_net_connect!(:allow => /localhost\:9200/)
  3
+
1 4
 Gemcutter::Application.configure do
2 5
   config.cache_classes = true
3 6
   config.whiny_nils = true
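The `:allow` pattern above is unanchored. Assuming WebMock matches it against the request URI as a string (an assumption worth checking against the WebMock version in use), the following plain-Ruby sketch shows what such a pattern accepts. `STRICT_ALLOW` and `allowed?` are hypothetical names used only for illustration.

```ruby
# Pattern from the diff: unanchored, so it matches anywhere in the string.
LOOSE_ALLOW = /localhost\:9200/

# Hypothetical stricter variant (not part of the pull request): anchored to
# the scheme and host, so the text cannot match inside a path or query string.
STRICT_ALLOW = %r{\Ahttps?://localhost:9200(/|\z)}

# Returns true when the URL string matches the pattern.
def allowed?(pattern, url)
  !!(url =~ pattern)
end

search_url = 'http://localhost:9200/rubygems/_search'
tricky_url = 'http://example.com/?next=localhost:9200'

[LOOSE_ALLOW, STRICT_ALLOW].each do |pattern|
  [search_url, tricky_url].each do |url|
    puts "#{pattern.inspect} #{url} => #{allowed?(pattern, url)}"
  end
end
```

For a test suite the loose pattern is probably sufficient; the anchored form only matters if stray external requests should never slip through.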
2  config/locales/en.yml
@@ -166,6 +166,8 @@ en:
166 166
     show:
167 167
       subtitle: "for %{query}"
168 168
       exact_match: Exact match
  169
+      tips: Tips
  170
+      tips_tooltip: "Show search tips"
169 171
 
170 172
   sessions:
171 173
     new:
46  features/search.feature
... ...
@@ -1,3 +1,5 @@
  1
+@search
  2
+
1 3
 Feature: Search
2 4
   In order to find a gem I want
3 5
   As a ruby developer
@@ -11,11 +13,8 @@ Feature: Search
11 13
       | name: twitter     | social junk  |
12 14
       | name: twitter-cli | command line |
13 15
       | name: beer_laser  | amazing beer |
14  
-    When I go to the homepage
15  
-    And I fill in "query" with "<query>"
16  
-    And I press "Search"
17  
-    Then I should see "search for <query>"
18  
-    And I should see "<result>"
  16
+    When I search for "<query>"
  17
+    Then I should see "<result>"
19 18
 
20 19
     Examples:
21 20
       | query      | result       |
@@ -29,9 +28,7 @@ Feature: Search
29 28
     Given the following version exists:
30 29
       | rubygem              | description |
31 30
       | name: foos-paperclip | paperclip   |
32  
-    When I go to the homepage
33  
-    And I fill in "query" with "paperclip"
34  
-    And I press "Search"
  31
+    When I search for "paperclip"
35 32
     Then I should not see "Exact match"
36 33
     But I should see "foos-paperclip"
37 34
 
@@ -48,9 +45,7 @@ Feature: Search
48 45
     Given the following version exists:
49 46
       | rubygem    | number | indexed |
50 47
       | name: RGem | 1.0.0  | false   |
51  
-    When I go to the homepage
52  
-    And I fill in "query" with "RGem"
53  
-    And I press "Search"
  48
+    When I search for "RGem"
54 49
     Then I should not see "RGem (1.0.0)"
55 50
 
56 51
   Scenario: The most recent version of a gem is yanked
@@ -59,8 +54,29 @@ Feature: Search
59 54
       | name: RGem  | 1.2.1  | true    |
60 55
       | name: RGem  | 1.2.2  | false   |
61 56
       | name: RGem2 | 2.0.0  | true    |
62  
-    When I go to the homepage
63  
-    And I fill in "query" with "RGem"
64  
-    And I press "Search"
65  
-    And I should see "RGem (1.2.1)"
  57
+    When I search for "RGem"
  58
+    Then I should see "RGem (1.2.1)"
66 59
     And I should not see "RGem (1.2.2)"
  60
+
  61
+  Scenario: The most downloaded gem is listed first
  62
+    Given a rubygem "Cereal-Bowl" exists with version "0.0.1" and 500 downloads
  63
+    And a rubygem "Cereal" exists with version "0.0.9" and 5 downloads
  64
+    When I search for "cereal"
  65
+    Then I should see these search results:
  66
+      | Cereal-Bowl (0.0.1) |
  67
+      | Cereal (0.0.9)      |
  68
+
  69
+  Scenario: The most downloaded gem is listed first and the rest of the results are ordered alphabetically
  70
+    Given a rubygem "Straight-F" exists with version "0.0.1" and 10 downloads
  71
+    And a rubygem "Straight-B" exists with version "0.0.1" and 0 downloads
  72
+    And a rubygem "Straight-A" exists with version "0.0.1" and 0 downloads
  73
+    When I search for "straight"
  74
+    Then I should see these search results:
  75
+      | Straight-F (0.0.1) |
  76
+      | Straight-A (0.0.1) |
  77
+      | Straight-B (0.0.1) |
  78
+
  79
+  Scenario: The user enters a search query with incorrect syntax
  80
+    When I search for "bang!"
  81
+    Then I should not see /Displaying.*Rubygem/
  82
+    But I should see "Sorry, your query is incorrect."
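The two ordering scenarios above expect results sorted by download count, with ties broken alphabetically by name. The sort that the pull request actually sends to elasticsearch is not shown in this part of the diff, so the following is only a plain-Ruby sketch of the expected ordering. `GemResult` and `expected_order` are hypothetical names.

```ruby
# Hypothetical stand-in for a search hit; not a class from the pull request.
GemResult = Struct.new(:name, :downloads)

# Orders hits by downloads (descending), then by name (ascending).
def expected_order(gems)
  gems.sort_by { |g| [-g.downloads, g.name] }.map(&:name)
end

straight = [
  GemResult.new('Straight-B', 0),
  GemResult.new('Straight-F', 10),
  GemResult.new('Straight-A', 0)
]

puts expected_order(straight).inspect
```

Negating the download count inside the sort key keeps the comparison in a single `sort_by` pass while still sorting names in ascending order.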
29  features/search_advanced.feature
... ...
@@ -0,0 +1,29 @@
  1
+@search
  2
+
  3
+Feature: Search Advanced
  4
+  In order to discover more gems
  5
+  As a Ruby developer
  6
+  I should be able to use advanced search on gemcutter
  7
+
  8
+  Scenario: Search in summaries and descriptions
  9
+    Given the following versions exist:
  10
+      | rubygem        | number | summary                                   | description             |
  11
+      | name: sinatra  | 0.0.1  | Sinatra is a DSL ...                      |                         |
  12
+      | name: vegas    | 0.0.1  | executable versions ... Sinatra/Rack apps |                         |
  13
+      | name: capybara | 0.0.1  |                                           | ... testing Sinatra ... |
  14
+    When I search for "sinatra"
  15
+    Then I should see these search results:
  16
+      | capybara (0.0.1) |
  17
+      | sinatra (0.0.1)  |
  18
+      | vegas (0.0.1)    |
  19
+
  20
+  Scenario: Searching in authors
  21
+    Given gems with these properties exist:
  22
+      | name     | version | authors                        | downloads |
  23
+      | sinatra  | 0.0.1   | Blake Mizerany, Ryan Tomayko   | 500       |
  24
+      | beefcake | 0.0.1   | Blake Mizerany                 | 50        |
  25
+      | vegas    | 0.0.1   | Aaron Quint                    | 5         |
  26
+    When I search for "author:blake"
  27
+    Then I should see these search results:
  28
+      | sinatra (0.0.1)   |
  29
+      | beefcake (0.0.1)  |
24  features/step_definitions/gem_steps.rb
@@ -19,6 +19,11 @@
19 19
   create(:version, :rubygem => rubygem, :number => version_number)
20 20
 end
21 21
 
  22
+Given /^a rubygem "([^\"]*)" exists with version "([^\"]*)" and (\d+) downloads$/ do |name, version, downloads|
  23
+  rubygem = create(:rubygem_with_downloads, :name => name, :downloads => downloads)
  24
+  create(:version, :rubygem => rubygem, :number => version)
  25
+end
  26
+
22 27
 Given /^I have a gem "([^\"]*)" with version "([^\"]*)" and homepage "([^\"]*)"$/ do |name, version, homepage|
23 28
   gemspec = new_gemspec(name, version, "Gemcutter", "ruby")
24 29
   gemspec.homepage = homepage
@@ -65,3 +70,22 @@
65 70
     rubygem.ownerships.create :user => user
66 71
   end
67 72
 end
  73
+
  74
+Given /^gems with these properties exist:$/ do |table|
  75
+  table.hashes.each do |row|
  76
+    if row['downloads']
  77
+      rubygem = FactoryGirl.create :rubygem_with_downloads, :name => row['name'], :downloads => row['downloads']
  78
+    else
  79
+      rubygem = FactoryGirl.create :rubygem, :name => row['name']
  80
+    end
  81
+
  82
+    FactoryGirl.create(:version, :rubygem => rubygem) do |version|
  83
+      version.number      = row['version']
  84
+      version.authors     = row['authors'].split(/\s*,\s*/)
  85
+      version.summary     = row['summary']
  86
+      version.description = row['description']
  87
+
  88
+      version.save
  89
+    end
  90
+  end
  91
+end
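The "gems with these properties exist" step calls `split` directly on `row['authors']`, which would raise `NoMethodError` for a table without an `authors` column, because a missing key returns `nil`. A minimal sketch of a nil-tolerant variant follows. `parse_authors` is a hypothetical helper and not part of the pull request.

```ruby
# Hypothetical helper (not in the pull request): turns a table cell such as
# "Blake Mizerany, Ryan Tomayko" into an array of author names.
# A missing (nil) or blank cell yields an empty array instead of raising.
def parse_authors(cell)
  cell.to_s.strip.split(/\s*,\s*/)
end

puts parse_authors('Blake Mizerany, Ryan Tomayko').inspect
puts parse_authors(nil).inspect
```

The regular expression is the one used in the step definition; it consumes any whitespace around each comma so the names come back trimmed.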
15  features/step_definitions/search_steps.rb
... ...
@@ -0,0 +1,15 @@
  1
+When /^I search for "([^"]*)"$/ do |query|
  2
+  steps %{
  3
+    When I go to the homepage
  4
+    And I fill in "query" with "#{query}"
  5
+    And I press "Search"
  6
+  }
  7
+end
  8
+
  9
+Then /^I should see these search results:$/ do |expected_table|
  10
+  # TODO: Make less brittle with an explicit CSS class in the view
  11
+  results  = page.all(".gems:last-child li a strong").collect(&:text)
  12
+
  13
+  assert_not_empty results
  14
+  expected_table.diff! Cucumber::Ast::Table.new( results.map { |r| Array(r) } )
  15
+end
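In the step above, `results.map { |r| Array(r) }` converts the flat list of link texts into single-column rows, which is the nested-array shape a Cucumber table is built from. A small plain-Ruby sketch of that transformation, without Cucumber itself. `single_column_rows` is a hypothetical name.

```ruby
# Hypothetical helper mirroring the expression used in the step definition:
# wraps each result string in its own one-element row.
def single_column_rows(results)
  results.map { |r| Array(r) }
end

rows = single_column_rows(['sinatra (0.0.1)', 'beefcake (0.0.1)'])
puts rows.inspect
```

Because each row holds exactly one cell, the comparison is sensitive to both the content and the order of the results, which is what the ordering scenarios rely on.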
2  features/step_definitions/webhook_steps.rb
@@ -18,7 +18,7 @@
18 18
 Then /^the webhook "([^\"]*)" should receive a POST with gem "([^\"]*)" at version "([^\"]*)"$/ do |web_hook_url, gem_name, version_number|
19 19
   WebMock.assert_requested(:post, web_hook_url, :times => 1)
20 20
 
21  
-  request = WebMock::RequestRegistry.instance.requested_signatures.hash.keys.first
  21
+  request = WebMock::RequestRegistry.instance.requested_signatures.hash.keys.last
22 22
   json = MultiJson.load(request.body)
23 23
 
24 24
   assert_equal gem_name, json["name"]
3  features/support/env.rb
@@ -4,6 +4,9 @@
4 4
 # instead of editing this one. Cucumber will automatically load all features/**/*.rb
5 5
 # files.
6 6
 
  7
+require 'webmock/cucumber' # Allow connections to elasticsearch
  8
+WebMock.disable_net_connect!(:allow => /localhost\:9200/)
  9
+
7 10
 require 'cucumber/rails'
8 11
 
9 12
 # Capybara defaults to XPath selectors rather than Webrat's default of CSS3. In
9  features/support/gemcutter.rb
... ...
@@ -1,6 +1,3 @@
1  
-require 'webmock'
2  
-WebMock.disable_net_connect!
3  
-
4 1
 Hostess.local = true
5 2
 Capybara.app_host = "https://gemcutter.local"
6 3
 
@@ -19,3 +16,9 @@
19 16
   FileUtils.rm_rf(TEST_DIR)
20 17
   $redis.flushdb
21 18
 end
  19
+
  20
+Before('@search') do |s|
  21
+  Rails.logger.debug "[TIRE] Recreating the elasticsearch index"
  22
+  Rubygem.tire.index.delete
  23
+  Rubygem.tire.create_elasticsearch_index
  24
+end
BIN  public/images/help.png
13  public/javascripts/application.js
@@ -2,4 +2,17 @@ $(document).ready(function() {
2 2
   $('#version_for_stats').change(function() {
3 3
     window.location.href = $(this).val();
4 4
   });
  5
+
  6
+  if (window.location.hash != '#tips') { $('#search-tips').hide(); }
  7
+  $('#search-tips-toggle').click(function(e) {
  8
+    e.preventDefault();
  9
+    var o = $('#search-tips');
  10
+    if ( o.is(':visible') ) {
  11
+      o.hide('fast');
  12
+      window.location.hash = '';
  13
+    } else {
  14
+      o.show('fast');
  15
+      window.location.hash = '#tips';
  16
+    }
  17
+  });
5 18
 });
78  public/stylesheets/screen.css
@@ -1731,3 +1731,81 @@ h5#downloads {
1731 1731
   font-weight: bold;
1732 1732
   padding-left: 24px;
1733 1733
 }
  1734
+
  1735
+/* In page search */
  1736
+#in-page-search {
  1737
+  margin: 0 0 0 1em;
  1738
+  padding: 0;
  1739
+  display: inline-block;
  1740
+}
  1741
+
  1742
+#in-page-search input[type="text"] {
  1743
+  padding: 0.5em 0.5em 0.75em 0.5em;
  1744
+  border: 1px solid rgba(0,0,0,0.15);
  1745
+  background: rgba(255, 255, 255, 0.6);
  1746
+  border-radius: 0.5em;
  1747
+  width: 41em;
  1748
+  bottom: 0.2em;
  1749
+  position: relative;
  1750
+}
  1751
+
  1752
+#in-page-search input[type="text"]:focus {
  1753
+  background: rgba(255, 255, 255, 0.9);
  1754
+}
  1755
+
  1756
+#search-tips-toggle {
  1757
+  color: #6b604a;
  1758
+  background: url('/images/help.png') 0 0px no-repeat;
  1759
+  padding: 4px 0 0 20px;
  1760
+  margin: 0 0 0 0.25em;
  1761
+  height: 20px;
  1762
+  display: inline-block;
  1763
+  bottom: 0.25em;
  1764
+  position: relative;
  1765
+}
  1766
+#search-tips-toggle:hover {
  1767
+  color: #300000;
  1768
+  text-decoration: underline;
  1769
+}
  1770
+
  1771
+#search-tips {
  1772
+  font-size: 80%;
  1773
+  background: rgba(255, 255, 255, 0.4);
  1774
+  border-top: 1px solid rgba(0,0,0,0.1);
  1775
+  border-bottom: 1px solid rgba(0,0,0,0.1);
  1776
+  margin: 2em 0 0 0;
  1777
+  padding: 1.75em 0 1em 0;
  1778
+}
  1779
+
  1780
+#search-tips p {
  1781
+  margin-top: 0;
  1782
+  margin-bottom: 0.5em;
  1783
+}
  1784
+
  1785
+#search-tips a.external {
  1786
+  text-decoration: underline;
  1787
+}
  1788
+
  1789
+#search-tips code {
  1790
+  color: #5E543E;
  1791
+  background: rgba(255, 255, 0, 0.4);
  1792
+  padding: 0.25em 0.75em 0.25em 0.75em;
  1793
+}
  1794
+#search-tips code:hover {
  1795
+  color: #fff;
  1796
+  background: rgba(0, 0, 0, 0.6);
  1797
+}
  1798
+
  1799
+#search-tips code a {
  1800
+  color: #5E543E;
  1801
+}
  1802
+#search-tips code:hover a {
  1803
+  color: #fff;
  1804
+}
  1805
+
  1806
+#search-tips .legend {
  1807
+  color: #85775b;
  1808
+  border-top: 1px solid rgba(0,0,0,0.05);
  1809
+  margin-top: 1.5em;
  1810
+  padding-top: 1em;
  1811
+}
10  test/factories.rb
@@ -62,6 +62,11 @@
62 62
     linkset
63 63
     name
64 64
 
  65
+    after(:create) do |this|
  66
+      this.touch
  67
+      Rubygem.tire.index.refresh
  68
+    end
  69
+
65 70
     factory :rubygem_with_downloads do
66 71
       after(:create) do |r|
67 72
         $redis[Download.key(r)] = r['downloads']
@@ -84,6 +89,11 @@
84 89
     requirements "Opencv"
85 90
     rubygem
86 91
     size 1024
  92
+
  93
+    after(:create) do |this|
  94
+      this.rubygem.touch
  95
+      Rubygem.tire.index.refresh
  96
+    end
87 97
   end
88 98
 
89 99
   factory :version_history do
16  test/functional/searches_controller_test.rb
... ...
@@ -1,6 +1,11 @@
1 1
 require 'test_helper'
2 2
 
3 3
 class SearchesControllerTest < ActionController::TestCase
  4
+  def setup
  5
+    super
  6
+    Rubygem.tire.index.delete
  7
+    Rubygem.tire.create_elasticsearch_index
  8
+  end
4 9
 
5 10
   context 'on GET to show with no search parameters' do
6 11
     setup { get :show }
@@ -60,4 +65,15 @@ class SearchesControllerTest < ActionController::TestCase
60 65
     should respond_with :redirect
61 66
     should redirect_to('the gem') { rubygem_path(@sinatra) }
62 67
   end
  68
+
  69
+  context 'on GET to show with bad search query' do
  70
+    setup { get :show, :query => 'bang!' }
  71
+
  72
+    should respond_with :internal_server_error
  73
+    should render_template :show
  74
+    should set_the_flash.now[:failure].to(/query is incorrect/)
  75
+    should "see no results" do
  76
+      assert ! page.has_content?("Results")
  77
+    end
  78
+  end
63 79
 end
4  test/test_helper.rb
... ...
@@ -1,4 +1,8 @@
1 1
 ENV["RAILS_ENV"] = "test"
  2
+
  3
+require 'webmock/test_unit' # Allow connections to elasticsearch
  4
+WebMock.disable_net_connect!(:allow => /localhost\:9200/)
  5
+
2 6
 require File.expand_path('../../config/environment', __FILE__)
3 7
 require 'rails/test_help'
4 8
 
5  test/unit/dependencies_middleware_test.rb
... ...
@@ -1,6 +1,11 @@
1 1
 require 'test_helper'
2 2
 
3 3
 class DependenciesMiddlewareTest < ActiveSupport::TestCase
  4
+  def setup
  5
+    super
  6
+    WebMock.stub_request(:any, /.*localhost:9200.*/).to_return(:body => '{}', :status => 200)
  7
+  end
  8
+
4 9
   def app
5 10
     V1MarshaledDepedencies.new
6 11
   end
5  test/unit/dependency_test.rb
... ...
@@ -1,6 +1,11 @@
1 1
 require 'test_helper'
2 2
 
3 3
 class DependencyTest < ActiveSupport::TestCase
  4
+  def setup
  5
+    super
  6
+    WebMock.stub_request(:any, /.*localhost:9200.*/).to_return(:body => '{}', :status => 200)
  7
+  end
  8
+
4 9
   should belong_to :rubygem
5 10
   should belong_to :version
6 11
 
5  test/unit/download_test.rb
... ...
@@ -1,6 +1,11 @@
1 1
 require 'test_helper'
2 2
 
3 3
 class DownloadTest < ActiveSupport::TestCase
  4
+  def setup
  5
+    super
  6
+    WebMock.stub_request(:any, /.*localhost:9200.*/).to_return(:body => '{}', :status => 200)
  7
+  end
  8
+
4 9
   should "load up all downloads with just raw strings and process them" do
5 10
     rubygem = create(:rubygem, :name => "some-stupid13-gem42-9000")
6 11
     version = create(:version, :rubygem => rubygem)
5  test/unit/helpers/rubygems_helper_test.rb
@@ -4,6 +4,11 @@ class RubygemsHelperTest < ActionView::TestCase
4 4
   include Rails.application.routes.url_helpers
5 5
   include ApplicationHelper
6 6
 
  7
+  def setup