Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Sfx4Solr #2

Merged
merged 57 commits into from

2 participants

@scotdalton
Owner

This is at least ready to have someone else look at it, if not for an actual pull into master.

@scotdalton

@jrochkind I'm not sure how to include this in the umlaut require. I'll bet probably something in the Umlaut::Engine class, but figured I'd ask before going down a rabbit hole.

@jrochkind, actually, never mind. i added the require to Umlaut::Engine and it seems to have worked.

Owner

I am a bit non-plussed about including a sunspot dependency in Umlaut, and requiring sunspot in every umlaut launch, whether or not it's being used. sunspot is for indexing to Solr, yes?

Can you put the 'require sunspot' in the file that, well, requires sunspot, instead of in Umlaut::Engine, so it only gets required into the app if you are using the Search module that requires it? Or I guess more to the point, in the rake task or whatever else you're providng that uses it? Will the search module actually use it, or will the search module just connect to a solr index, and it's only the indexing code that uses it?

I'm not sure where to move it since I think it needs to be available at class loading time, which happens in Umlaut::Engine (right?).
Sunspot hooks into ActiveRecord to provide a class method called searchable where you configure your Solr index.
https://github.com/sunspot/sunspot/blob/master/sunspot_rails/lib/sunspot/rails/railtie.rb#L6

I'm configuring searchable in an Sfx4 model:
a3d84ef#L9R38

I'm definitely open to suggestions.

Also, what's your concern about adding the sunspot dependency to Umlaut? If a Umlaut user doesn't want to use it, they don't have to and it shouldn't complain, but if they do want to use it, they have the option. Of course, I'll need to document the feature a bit more for it to be useful to anyone but me.

Owner

Hmm, "class-loading time" is ambiguous, but looking at the sunspot code you helpfully link to, I see it needs to be present at rails init, which is less ambiguous, and yeah, I see the problem.

However, the fact that it hooks into ActiveRecord makes me even less enthused about doing it automatically. If you don't want to use this feature, you really don't want Sunspot monkey patching or modifying ActiveRecord -- because tehre's a chance it will hurt AR performance, or have a bug and mess up AR, or whatever.

Also, in general, every gem dependency you add -- particularly if it's a large one -- can increase boot time for a rails app.

What about not including sunspot as a gem dependency at all -- but instead providing instructions that if someone wants to use this Search Method, they should add sunspot as a dependency to their own local app, and follow sunspot directions for install (such as perhaps adding the 'require' to your config/application.rb if that's what sunspot wants).

Or alternately extracting this search method into a gem of it's own, umlaut-sfx-sunspot-search or something. That gem could have the sunspot dependency and could have the 'require' in it's own engine.rb file -- if you want this search method, you just add that gem to your app. At which point the rest of Umlaut can see it's classes, including the search method module, everything would work smoothly.

What do you think?

Cool. I'm thinking leave the code in but check for the Sunspot module. If it's not there, don't run the code. Then individual applications can add the sunspot_rails gem dependency and require it in their own application.rb.

Does that sound like it would work? I'd still add it as a development dependency in the umlaut gem for testing.

Owner
app/controllers/search_methods/sfx4_solr/README.md
((13 lines not shown))
+ require "sunspot_rails"
+
+Then run
+
+ $ bundle install
+ $ rails generate sunspot_rails:install
+
+This will create a file config/sunspot.yml. You can configure the information about your Solr instance here.
+If you don't feel like running your own solr instance, [Websolr](http://websolr.com/) has built in Sunspot support.
+
+You'll also need to specify your SFX4 DBs in your databases.yml. The configuration names are different from the Sfx4
+SearchMethod in order to avoid (promote?) ambiguity and confusion.
+
+ sfx4_global:
+ adapter: mysql2
+ database: sfxglb41
@jrochkind Owner

these examples should include host/port keys, no? you'r going to have to configure sfx db wtih host/port, so AR can find it, yes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
app/controllers/search_methods/sfx4_solr/local.rb
@@ -0,0 +1,10 @@
+module SearchMethods
+ module Sfx4Solr
+ module Local
+ def az_title_klass
+ Module.const_get(:Sfx4).const_get(:Local).const_get(:AzTitle)
+ end
+ include SearchMethods::Sfx4Solr::Searcher
+ end
+ end
+end
@jrochkind Owner

can you explain what this file does, what it's for?

@jrochkind Owner

Why Module.const_get(:Sfx4).const_get(:Local).const_get(:AzTitle) instead of the straightforward Sfx4::Local::AzTitle?

@scotdalton Owner

I need to search multiple instances of SFX so I create multiple Searchers in this same way. I wanted an easy way to include and configure this functionality. Abstracting out the Searcher bit and having the outer module tell the searcher what it's AzTitle class is (and hence SFX db connection) was how I did it. This also allows Sunspot to easily reindex my multiple AzTitles in Solr.

Re: the const_get stuff, I think I was having an issue finding the module, but I'll see it that is still an issue.

@scotdalton Owner

Yeah, I ended up needing the const_get stuff. Otherwise it was looking in the current module and not finding it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
app/controllers/search_methods/sfx4_solr/searcher.rb
@@ -0,0 +1,94 @@
+module SearchMethods
+ module Sfx4Solr
+ module Searcher
+ def self.included(klass)
+ klass.class_eval do
+ extend SearchMethods::Sfx4::UrlFetcher
+ include InstanceMethods
+ end
+ end
+
+ module InstanceMethods
@jrochkind Owner

I dont' think there's any reason to put your instance methods in a submodule InstanceMethods and then include them. This is something you see in the wild in rubyland that has no purpose whatsoever. Just include your instance methods directly in the module Searcher, and then they'll be automatically included when Searcher is included, because that's how ruby modules work. You follow me?

I also wonder if there's a way to avoid this UrlFetcher business, still trying to understand what's going on there.

@scotdalton Owner

I kinda like the convention because it makes it clear what's an instance method and I find it more readable, but yeah, it's redundant. I'll go ahead and do it in the less redundant way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
test/dummy/db/migrate/20120927163304_sfx4_global.rb
@@ -0,0 +1,15 @@
+# Create a test SFX Global DB.
+class Sfx4Global < ActiveRecord::Migration
+ def connection
+ ActiveRecord::Base.establish_connection(:sfx4_global)
+ ActiveRecord::Base.connection.initialize_schema_migrations_table
+ ActiveRecord::Base.connection
+ end
+
+ def change
+ create_table "KB_OBJECTS", {:id => false} do |t|
+ t.integer "OBJECT_ID", :default => 0, :null => false, :limit => 8
+ end
+ execute "ALTER TABLE KB_OBJECTS ADD PRIMARY KEY (OBJECT_ID);"
+ end
+end
@jrochkind Owner

wait, why are there migrations on the SFX db? We dont' want to change the sfx db, do we?

@scotdalton Owner

I added these for the sunspot piece because I was using a local Solr index (provided by sunspot_solr). I ended up just using VCR for that Solr bit and didn't run a local instance, but I still use these migrations as a quick test of the Sfx4 search module. I db:migrate them in travis and add some SFX fixtures and can do some sanity testing on the Sfx4 searching.

@jrochkind Owner
@scotdalton Owner

Yeah, I definitely need to document it. I added a travis flag to the DB config which I subsequently forgot to check. I'll check that (maybe change it to mock_database) and raise an error if it's not set. These migrations are only in the test/dummy app and the db names in that database.yml should indicate test (sfxlcl41_test, sfxglb41_test).
I'll additionally comment the test/dummy/config/database.yml with the purpose of these.

Other thoughts on ways to make this less harmful?

@jrochkind Owner
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@jrochkind
Owner

I also in general don't like the proliferation of SFX models we've got now. app/models/sfx_db, and ./sfx4/local and ./sfx4/global, three sets of models for talking to SFX! What are the reasons you couldn't re-use the existing sfx_db models, maybe fork them into local/global, instead of adding a new set?

Also, need to talk about the way you're using "establish_connection", I think there's a better way.

Also have some overall questions/suggestions about the architecture you're setting up for the "SFX-controlled URLs" stuff, I am not sure it's good to couple it so tightly to search method.

Set up an IRC appointment?

@scotdalton
Owner

Yeah, the SFX models are out of hand. I didn't want to overwrite any of your stuff, but it can definitely be cleaned up/merged. Do we even still need to support SFX 3?

Re the SFX URLs, I didn't want to stray too far from what you'd done. I had a thought that different services might control different URLs. It would be a significant refactor, but if a service had a fetch_controlled_urls method implemented, the nightly maintenance could fetch them and associate them with the given service. Then other services could defer (or not) for URLs they don't control. This could even be part of the standard service config (e.g. defer: true).

@scotdalton
Owner

@jrochkind I cleaned up the SFX models as we discussed and updated some documentation/comments. Take a look and let me know if we're good to merge or if I forgot anything.

Additionally, I'll delete the travis branch since it's been fully merged with this one.

Thanks!

@jrochkind
Owner
@jrochkind
Owner

Have we removed sfx3 support, do you think? If so, we should remove the sfx3 search adapter and any references to it, and document that while core link resolving behavior works with sfx3 or 4 (the sfx plugin), A-Z list type functionality only works with sfx4.

@jrochkind
Owner

also travis is saying this branch is failing right now.

@scotdalton
Owner

We have removed SFX 3 support. I'll document that in Wiki. I renamed the database.yml to travis_database.yml and then told travis to mv it to the proper location before running tests. I put in some more code to try to ensure nothing bad happens with the SFX testing.

@scotdalton
Owner

I updated the Developing Wiki page with updated testing documentation and added a README that reflects the same content to the ./test directory.

I think this branch should be pretty close now.

@jrochkind
Owner
@jrochkind
Owner

Okay, I'm reviewing how the testing infrastructure works in umlaut -- it's probaby always been pretty sketchy, but now that we have more real tests (thanks!), it matters. So I'm partially just going through to figure out how to document "how you should run tests".

if I check out a fresh sfx4solr branch and try to run 'rake test', I get No such file or directory - /home/rochkind/umlaut_fresh/test/dummy/config/database.yml Confirmed this is true of master too, it's not new to your branch.

So, okay, i'm expected to make my own database.yml pointing to a real mysql -- how can we make this easier? I wonder if sqlite3 would work for testing, even though umlaut in general probably doesn't. I thought for travis you had added a database.yml file of some kind -- but can't find it in the repo? I think it should be there at ./config/travis-database.yml or something, but I'm having trouble finding that -- but if it's working for travis, it must be there somewhere, what am I missing?

@jrochkind
Owner

Ah, I didn't actually have the latest version of the sfx4solr branch. okay moving forward.

@jrochkind
Owner

okay, i'm still not sure what the best way to document to run the tests manually is. It's kind of a mess with those SFX migrations. What do you think? You think we need to tell people to set up a mock sfxlcl and sfxglb database in mysql?

For read-only stuff (everything EXCEPT the sunspot stuff) it would be safe to point this at a REAL SFX. Except the existence of those migrations makes it REALLY dangerous to do that.

I'd like to set up a way, as easy as possible, for people to run all the tests EXCEPT the sunspot ones (because those require extra setup). Any ideas as to how we might do that? What might be a part of that, any ideas of how to avoid having those migrations that touch the sfx_db and sfx4_global (normally in production your real SFX, and suitable for testing against any non-sunspot behavior using your real SFX) -- so there's no chance of accidentally running them against your real SFX?

@jrochkind
Owner

Ah, I see the "mock_instance" thing, okay, that's good. Still thinking on this in general, please do share any ideas you have!

@jrochkind
Owner

okay, moving along with trying to run the tests by hand:

./test/unit/primo_service_test.rb: nyu_only_tests(self.name) do

I get undefined methodnyu_only_tests`

I forget if this is something I added or you! Where is this supposed to be defined? Did you forget to check something into git? (Although then I have no idea how it works on travis).

@jrochkind
Owner

okay, you removed the def nyu_only_tests from test_helper as no longer neccesary, but in the source in the repo, nyu_only_tests is still being CALLEd in the primo tests -- and there's no VCR cassettes for primo tests (which is presumably what makes it no longer neccesary?). Did you forget to push some changes?

I am also worried how this could POSSIBLY be passing on travis. You sure the travis is actually working to run our tests?

@scotdalton
Owner

So, yeah sorry! I'm not sure what happened on that last push and why travis seemed to like it. The tests are failing for me as well now (which is good, I guess). I pushed a couple of changes that I hope will fix some problems, but I'm not exactly sure what's happening on travis now. I'm going to specifically flunk a test on and push to travis. That will at least let us know that travis is running tests and failing.

I'll keep investigating.

Again, sorry about that.

@scotdalton
Owner

So yeah, travis passes after I told it to flunk. That's not good. I'll continue to investigate.

@scotdalton
Owner

Ok, so I understand what's happening. When I did my migration, I told travis to go to the dummy directory, but didn't realize I needed to tell it to go to the root dir to run tests. My bad. Running tests correctly (and failing) now.

@scotdalton
Owner

OK, travis is looking much better now. 1.8.7 is failing for sunspot since Module.const_defined? doesn't find :Sunspot. I'll tackle that tomorrow. 1.9.3 is now passing.

@jrochkind
Owner
@scotdalton
Owner

So travis is finally really passing. Phew.

@scotdalton
Owner

Travis Testing Overview

In the before script: section of the .travis.yml file 3 main actions happen

  1. Travis creates three dbs with the following before script: entries

    • mysql -e 'create database umlaut3_test;'
    • mysql -e 'create database sfxlcl41_test;'
    • mysql -e 'create database sfxglb41_test;'
  2. Travis then renames the file ./test/dummy/config/travis_database.yml to database.yml

    • mv test/dummy/config/travis_database.yml test/dummy/config/database.yml
  3. Travis then migrates the dummy application migrations, include the SFX 4 mock instances.

    • cd test/dummy && RAILS_ENV=test bundle exec rake --trace db:schema:load db:migrate && cd ../../

Then travis just runs bundle exec rake

SFX Mock Tests

SFX fixtures are only loaded if the SFX db has been configured with mock_instance: true. SFX search assertions are only executed if the SFX db has been configured with mock_instance: true. If the SFX db has been configured but not configured as a mock_instance no fixtures are loaded, no assertions are executed but the search is run to ensure that it does not raise errors.

VCR Cassettes

VCR cassettes are currently set up for Sfx4Solr, Google Books and Primo and are saved in .test/vcr_cassettes. A VCR support module, TestWithCassette is provided in ./test/support/test_with_cassette.rb to make VCR testing easier. The module was leveraged from jrochkind's BentoSearch

@jrochkind
Owner

okay, if I check out the sfx4solr branch, set up a ./test/dummy/config/database.yml to point to a test database (not worrying about SFX yet), and run tests....

I still get one failing test (as well as a bunch of garbage in the test output, but thats' another story)....

test_missing_id(PermalinksTest):
ActiveRecord::RecordNotFound: Couldn't find Permalink with id=999999999

/home/rochkind/umlaut/test/integration/permalinks_test.rb:11:

Any ideas? Is this test something you did, or a pre-existing test that's my responsibility/fault?

@jrochkind
Owner

Also, the soap4r gem is causing a deprecation warning for using 'iconv'. I had to hack like hell to figure out it was soap4r that was triggering this. So at least making a note of it. You really need soap4r, eh? Feel like submitting a patch to them to fix this, and not use iconv in 1.9.3?

/home/rochkind/.rvm/gems/ruby-1.9.3-p194/gems/activesupport-3.2.8/lib/active_support/dependencies.rb:251:in `block in require': iconv will be deprecated in the future, use String#encode instead.
["/home/rochkind/.rvm/gems/ruby-1.9.3-p194/gems/soap4r-ruby1.9-2.0.5/lib/xsd/iconvcharset.rb:9:

@scotdalton
Owner

I didn't add any integration tests, but that doesn't mean I didn't break it with an overzealous add of fixtures. Removed the dummy permalink fixtures, so hopefully that helps.

Re: soap4r, I've had a savon refactor on my TODO list for a while now. This is a good excuse to move forward with it. Hopefully it won't be too painful

@jrochkind
Owner

oddly, looks like the iconv problem has already been fixed in the official 'soap4r' gem, but we're using a weird 'soap4r-ruby-1.9'. It looks like official soap4r claims to be good for ruby 1.9 for some time now -- you might try just switching gem dependency to official soap4r, and seeing if it still works?

Thanks for fixing permalink test failure however you did it -- how come test was failing for me, but not you or travis?

@jrochkind
Owner

and the test you removed wasn't actually important, we don't want/need that test?

@scotdalton
Owner

not sure about travis vs. you (but it wasn't failing for me either). didn't remove a test just some dummy fixtures.

@jrochkind
Owner

huh weird and distressing. have to see if that made the test actualy pass for me, i'll find out tomorrow. There are a LOT of fixtures in there now; if there are more that no tests are actually using, would definitely be good to trim them. Also, if you feel like it, might want to investigate a solution like FactoryGirl instead of rails fixtures. Haven't actually used anything but rails fixtures, but rails fixtures get out of hand fast. Then again, everything might get out of hand fast -- I think the best bet is just to use them as minimally as possible -- I also sometimes manually add the objects I need to the db in my actual test code, with actual ordinary ActiveRecord a = Foo.create ; a.save! lines, which sometimes makes things more transparent than fixtures.

@jrochkind
Owner

i'm still getting the test failure (actually uncaught exception, not test failure) for:

test_missing_id(PermalinksTest):
ActiveRecord::RecordNotFound: Couldn't find Permalink with id=999999999

But if you can't reproduce that failure, guess I'll have to debug myself. Now that we have lots of tests though, I want to get the test running process nice and seamless and known working before merging this branch.

@jrochkind
Owner

Ah, I know the problem -- because Rails controller rescue_with aren't operative during tests. I've run into that before. So what in production would be caught by Rails and result in a 4xx error (what we're testing for), in testing results in an uncaught exception instead. Bah. I consider this a bug in Rails, but when I've run into it and tried to report it before, Rails has not agreed.

What I can't figure out is why you and travis aren't getting the failure. This worries me.

@jrochkind
Owner

okay, I figured out what's wrong, it's a problem with my code for sure. My code only installs certain exception rescuers if consider_all_requests_local is not true. But it's true in 'test'. And yet I have a test htat will only succesfully pass if the exception rescuers are in effect. So okay, it's broken.

What I can't figure out is a) How this EVER worked for me, it presumably passed when i wrote it, but more disturbingly: b) How it's working for you or travis. I am worried that this test is not in fact being executed for you or travis, which is worrying.

/home/rochkind/umlaut/test/integration/permalinks_test.rb, can you verify that you (and travis) are really running this test? And it's really passing somehow? Can you try to run this test file on it's own, and tell me if it passes for you?

@scotdalton
Owner

OK, so I don't have that test and I don't see it in master. Do you just have it locally?

@jrochkind
Owner

sorry for the delay, trying to get back to this.

I don't like the "Solr isn't loaded not using solr features" message -- this is an optional feature, there is no reason to output that on every boot if someone isn't using it. Can I get rid of that?

jrochkind and others added some commits
@jrochkind jrochkind permalink not found proper 404
proper integration test. No longer uses raising exception and rescue_from,
it was hard to test, because it's a screwy design. the original method
gets more confusing now, but so be it.
5a6dec5
@jrochkind jrochkind Merge branch 'sfx4solr' of github.com:team-umlaut/umlaut into sfx4solr 60bb07d
@jrochkind jrochkind notes on testing infrastructure in README ce6a931
@jrochkind jrochkind don't warn sunspot::rails not loaded on every startup
let alone on every time sunspot? is called. This is an optional feature I
expect most won't use. does not need to log that it's not being
used on every startup
669c6e2
@scotdalton scotdalton Fix bug in Sfx4Solr searcher whn we have a valid issn, isbn or lccn. e2c24cb
@scotdalton
Owner

Yeah, feel free to get rid of the "Solr isn't loaded not using solr features" message. (Looks like you already did.)

@jrochkind
Owner

okeydoke, I sign off on merging this into master! You want to do the honors?

I have tagged a 3.0.4 release before merging this in, in case someone ends up needing a "everything that was done before this refactoring" checkpoint.

I'm thinking we wait until the bootstrap stuff is merged in too (excited to start looking at that!), and then release a 3.1 (or even 4? Nah, 3.1 I think).

@scotdalton scotdalton merged commit 014521c into from
@scotdalton
Owner

Awesome! Done.

Thanks,
Scot

@jrochkind
Owner

sweet. delete the sfx4solr feature branch now too?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Sep 27, 2012
  1. @scotdalton
  2. @scotdalton

    Still trying.

    scotdalton authored
  3. @scotdalton
  4. @scotdalton
  5. @scotdalton
Commits on Sep 30, 2012
  1. @scotdalton
  2. @scotdalton
  3. @scotdalton
  4. @scotdalton
  5. @scotdalton
Commits on Oct 1, 2012
  1. @scotdalton

    Update GBS test to use VCR, refactor to use TestWithCassette from @jr…

    scotdalton authored
    …ochkind's bento_search and move sunspot_rails require statement to Umlaut::Engine.
  2. @scotdalton
Commits on Oct 2, 2012
  1. @scotdalton

    Add/update tests for sfx4 searching and sfx4solr searching. Add instr…

    scotdalton authored
    …uctions for creating SearchMethods in README. Configured sfx_db to point to test SFX4.
  2. @scotdalton

    Update to remove sunspot_rails gem dependency and only include Sfx4So…

    scotdalton authored
    …lr functionality if the application has explicitly added it.
  3. @scotdalton
  4. @scotdalton
Commits on Oct 10, 2012
  1. @scotdalton
  2. @scotdalton
  3. @scotdalton

    Merge with travis.

    scotdalton authored
  4. @scotdalton
  5. @scotdalton

    Update test/dummy db configs and migration to raise an error if it's …

    scotdalton authored
    …not explicitly a test_instance.
  6. @scotdalton
  7. @scotdalton
  8. @scotdalton
  9. @scotdalton
  10. @scotdalton
Commits on Oct 11, 2012
  1. @scotdalton
  2. @scotdalton

    Update documentation.

    scotdalton authored
  3. @scotdalton
  4. @scotdalton

    Update to make test pass in general and not just for travis. Also hav…

    scotdalton authored
    …e SFX db models check if their configs are set.
  5. @scotdalton

    Add the SFX4 global db definition to the generator with comments indi…

    scotdalton authored
    …cating that it is only used for Sfx4Solr.
  6. @scotdalton
  7. @scotdalton
Commits on Oct 12, 2012
  1. @scotdalton

    Add test README file.

    scotdalton authored
  2. @scotdalton
Commits on Oct 16, 2012
  1. @scotdalton
  2. @scotdalton

    Intentionally flunking.

    scotdalton authored
  3. @scotdalton
  4. @scotdalton
  5. @scotdalton

    Update schema version.

    scotdalton authored
  6. @scotdalton

    Update SFX db migrations.

    scotdalton authored
  7. @scotdalton

    Update schema version.

    scotdalton authored
  8. @scotdalton
  9. @scotdalton
  10. @scotdalton
  11. @scotdalton
  12. @scotdalton
  13. @scotdalton
  14. @scotdalton
  15. @scotdalton
Commits on Oct 17, 2012
  1. @scotdalton
Commits on Nov 7, 2012
  1. @scotdalton
  2. @jrochkind

    permalink not found proper 404

    jrochkind authored
    proper integration test. No longer uses raising exception and rescue_from,
    it was hard to test, because it's a screwy design. the original method
    gets more confusing now, but so be it.
  3. @jrochkind
  4. @jrochkind
  5. @jrochkind

    don't warn sunspot::rails not loaded on every startup

    jrochkind authored
    let alone on every time sunspot? is called. This is an optional feature I
    expect most won't use. does not need to log that it's not being
    used on every startup
  6. @scotdalton
Something went wrong with that request. Please try again.