how to get xapian document from model #27

Closed
thoughtafter opened this Issue Jan 30, 2012 · 3 comments

Contributor

thoughtafter commented Jan 30, 2012

I know this is more of a question than an issue, unless it cannot be done easily, in which case it is an issue. I want to get the Xapian document for a model instance. Specifically, I'd like to pass it to find_similar_to:

Model.find_similar_to(Model.first.xapian_document)

Or something like that. Of course, find_similar_to could be updated to accept a model, but I think this will have other applications as well. Basically, I'm looking for the inverse of the indexed_object property. There is probably a Xapian method to retrieve a document by its xapian_id, but I haven't dug deeply enough to find it.

Owner

gernotkogler commented Jan 31, 2012

If you want to find the Xapian document that belongs to an Active Record model, you could try the following (not tested).

Add an index statement for the unique model id to your blueprints, like so:

blueprint.index :model_id do
  "#{self.class.name}-#{id}"
end

Now you should be able to search by model id and get at most one hit:

XapianDb.search "model_id:address-22"

Contributor

thoughtafter commented Jan 31, 2012

Yes, that occurred to me immediately, but it seemed silly to index another field when we're already passing that string in twice, as "data" and as "unique term". It took a few hours of reading docs and source code, but I came up with this:

def xapian_docid
  XapianDb.database.reader.postlist("Q#{xapian_id}").first.docid
end

def xapian_document
  XapianDb.database.reader.document(xapian_docid)
end

With these methods in the model, that doesn't quite work as expected, because add_doc_helper_methods_to has not been applied to the returned document. I thought you might have a better idea than I do of how to integrate this into xapian_db.
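
The lookup above relies on every indexed object having a unique term (the Q prefix plus its xapian_id) whose posting list holds at most one document. Here is a minimal sketch of that idea in plain Ruby. ToyIndex and document_for are hypothetical names that only mimic the shape of the calls above; they are not part of Xapian or xapian_db. The nil guard for an object that has not been indexed is an addition, not something from the thread.

```ruby
# Toy stand-in for a Xapian database. ToyIndex and its methods are
# hypothetical and only mimic the shape of the calls used above.
class ToyIndex
  Posting = Struct.new(:docid)

  def initialize
    @documents  = {}                             # docid => document data
    @postings   = Hash.new { |h, k| h[k] = [] }  # term  => [Posting, ...]
    @next_docid = 1
  end

  # Store a document under a unique term such as "QAddress-22".
  def add(unique_term, data)
    docid = @next_docid
    @next_docid += 1
    @documents[docid] = data
    @postings[unique_term] << Posting.new(docid)
    docid
  end

  # All postings for a term (an empty array when the term is unknown).
  def postlist(term)
    @postings.fetch(term, [])
  end

  def document(docid)
    @documents.fetch(docid)
  end
end

# Resolve a unique id to its document, or nil if nothing is indexed.
def document_for(index, xapian_id)
  posting = index.postlist("Q#{xapian_id}").first
  posting && index.document(posting.docid)
end

index = ToyIndex.new
index.add("QAddress-22", "22 Example Street")
```

Calling document_for(index, "Address-22") returns the stored data, while an id that was never indexed yields nil instead of raising.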

Owner

gernotkogler commented Feb 5, 2012

You could do it like this:

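def xapian_document
  # Look up the document id through the unique term of this object
  docid       = XapianDb.database.reader.postlist("Q#{xapian_id}").first.docid
  doc         = XapianDb.database.reader.document docid
  # Add the blueprint's accessor methods to this document instance
  blueprint   = XapianDb::DocumentBlueprint.blueprint_for self.class.name
  doc.extend blueprint.accessors_module
end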
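
The last line of that method works because Object#extend mixes a module into a single object and returns that same object. The plain-Ruby sketch below shows this behaviour on its own. ToyAccessors and ToyDocument are hypothetical stand-ins, not the module or document class that xapian_db actually uses.

```ruby
# Hypothetical stand-in for the accessors module a blueprint provides.
module ToyAccessors
  def title
    data[:title]
  end
end

# Hypothetical stand-in for a document that carries raw data.
ToyDocument = Struct.new(:data)

plain    = ToyDocument.new({ title: "first" })
extended = ToyDocument.new({ title: "second" })

# extend adds the module's methods to this one object only and returns
# the object itself, so it can be the last expression of a method.
result = extended.extend(ToyAccessors)
```

After this runs, extended responds to title and plain does not, so other documents of the same class are unaffected.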