Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cannot search for "subject OR subject" in a simple search field #15

Closed
phluid61 opened this issue Jan 22, 2013 · 2 comments
Closed

cannot search for "subject OR subject" in a simple search field #15

phluid61 opened this issue Jan 22, 2013 · 2 comments
Assignees
Labels
Milestone

Comments

@phluid61
Copy link
Contributor

@phluid61 phluid61 commented Jan 22, 2013

Searching for subjects in a simple search field using merge=ANY causes an SQL error.

Here is an illustrative example:

Step 1

Add the following search form to archive/cfg/cfg.d/search.pl :

$c->{search}->{errortest} = {
    search_fields => [
        { meta_fields => [ "title", "subjects" ], id => "q" },                                                                                        
    ],
    preamble_phrase => "cgi/search:preamble",
    title_phrase => "cgi/search:title",
    citation => "result",
    page_size => 20,
    order_methods => { "byyear" => "-date/creators_name/title" },
    order_menu => [ "byyear" ],
    default_order => "byyear",
    show_zero_results => 1,
};

Since the field searches both title and subjects, we have to enter the subject code instead of choosing it from a list.

Step 2

Search for: [any of] "070402 070402"

Note: searching for [all of] the above works and finds the appropriate subject.

Step 3:

In perl_lib/EPrints/Search.pm, in sub perform_search add the line: print STDERR $self->get_conditions->describe; before generating a new cache entry.

Outcome

In the error log we see:

AND(
        =($archive.metadata_visibility,"show") ... eprint,
        =($archive.eprint_status,"archive") ... eprint,
        OR(
                OrSubQuery(
                        in_subject($archive.subjects,"mathematics") ... eprint_subjects,
                        in_subject($archive.subjects,"accountancy") ... eprint_subjects
                ),
                OrSubQuery(
                        index($archive.title,"mathematic") ... eprint__rindex,
                        index($archive.title,"accountancy") ... eprint__rindex
                )
        )
)

and the following error:

DBD::mysql::db do failed: Unknown column '140549149986120subject_ancestors.ancestors' in 'where clause' at /usr/local/eprints/perl_lib/EPrints/Database.pm line 3213.
[myeprints 2013-01-22 10:06:24] SQL ERROR (do): INSERT INTO `cache8841073`(`pos`, `eprintid`) SELECT @i:=@i+1, `eprintid` FROM (SELECT `eprint`.`eprintid` FROM `eprint` LEFT JOIN `eprint__ordervalues_en` ON `eprint`.`eprintid`=`eprint__ordervalues_en`.`eprintid`, (SELECT `eprint`.`eprintid` AS `eprintid` FROM `eprint`, (SELECT `eprint`.`eprintid` AS `eprintid` FROM `eprint`, `eprint_subjects` AS `eprint_subjects`, `subject_ancestors` AS `140549149756928subject_ancestors` WHERE `eprint`.`eprintid`=`eprint_subjects`.`eprintid` AND `eprint_subjects`.`subjects`=`140549149756928subject_ancestors`.`subjectid` AND (`140549149756928subject_ancestors`.`ancestors` = 'mathematics' OR `140549149986120subject_ancestors`.`ancestors` = 'accountancy') UNION SELECT `eprint`.`eprintid` AS `eprintid` FROM `eprint`, `eprint__rindex` AS `eprint__rindex` WHERE `eprint`.`eprintid`=`eprint__rindex`.`eprintid` AND (`eprint__rindex`.`field`='title' AND `eprint__rindex`.`word`='mathematic' OR `eprint__rindex`.`field`='title' AND `eprint__rindex`.`word`='accountancy')) AS `or_140549149716736` WHERE `eprint`.`eprintid`=`or_140549149716736`.`eprintid`) AS `and_140549150014272_0` WHERE `eprint`.`eprintid`=`and_140549150014272_0`.`eprintid` AND `eprint`.`metadata_visibility` = 'show' AND `eprint`.`eprint_status` = 'archive' GROUP BY `eprint`.`eprintid`, `eprint__ordervalues_en`.`date`, `eprint__ordervalues_en`.`creators_name`, `eprint__ordervalues_en`.`title` ORDER BY `eprint__ordervalues_en`.`date` DESC, `eprint__ordervalues_en`.`creators_name` ASC, `eprint__ordervalues_en`.`title` ASC) `S`
[myeprints 2013-01-22 10:06:24] SQL ERROR (do): Unknown column '140549149986120subject_ancestors.ancestors' in 'where clause' (#1054)

The significant portion is the left-hand side of the UNION, corresponding with the first OrSubQuery:

SELECT `eprint`.`eprintid` AS `eprintid`
FROM
    `eprint`,
    `eprint_subjects` AS `eprint_subjects`,
    `subject_ancestors` AS `140549149756928subject_ancestors`
WHERE `eprint`.`eprintid`=`eprint_subjects`.`eprintid`
AND `eprint_subjects`.`subjects`=`140549149756928subject_ancestors`.`subjectid`
AND (
    `140549149756928subject_ancestors`.`ancestors` = 'mathematics' OR
    `140549149986120subject_ancestors`.`ancestors` = 'accountancy'
)

Note that the first three instances of the aliased subject_ancestors table have a different prefix from the fourth. This is a direct consequence of the change at:

http://trac.eprints.org/eprints/changeset/6251/trunk/system/perl_lib/EPrints/Search/Condition/InSubject.pm

Note: Our repository is forked from ePrints 3.2, however based on code inspection I believe the same issue would occur in trunk, if anyone ever decided to simple-search for subjects.

@ghost ghost assigned timbrody Jan 22, 2013
@timbrody timbrody closed this in 80f9c52 Jan 22, 2013
@timbrody
Copy link
Contributor

@timbrody timbrody commented Jan 22, 2013

Please confirm that this resolves your issue. The trouble with search bugs is avoiding collateral damage ... given the complexity of generating the SQL.

@phluid61
Copy link
Contributor Author

@phluid61 phluid61 commented Jan 22, 2013

I confirm this patch fixes the error.
Thanks!

timbrody pushed a commit that referenced this issue Feb 6, 2013
Fixes #15

InSubject mints a new prefix in join and logic, but SubQuery (as inherited by
OrSubQuery) only uses the join from the first sub-op. Hence the perfix used in
logic for any other ORed InSubject sub-ops would not match the prefix used in
the join.

Fixed by always setting a prefix in the call to join() and logic() - empty
string so we don't make the SQL any more difficult to read than it already is.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants