This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

changing the content of a page leads to multiple search results for the same page #8439

Closed
fritzmg opened this issue Aug 12, 2016 · 2 comments
fritzmg commented Aug 12, 2016

Reproduction

The first result will have [100% relevance] and shows the most recent edit:

end users can only log in to the front end. Lorem ipsum. Lorem ipsum dolor sit amet. Learn

The second result will have [50% relevance] and shows the initial edit:

end users can only log in to the front end. Lorem ipsum. Learn more On the following pages you

Cause

\Contao\Search::indexPage looks up existing entries by the checksum of the page's content and then either updates the matching entry or creates a new one.

$objIndex = $objDatabase->prepare("SELECT id, url FROM tl_search WHERE checksum=? AND pid=?")
                        ->limit(1)
                        ->execute($arrSet['checksum'], $arrSet['pid']);

// Add the page to the tl_search table
if ($objIndex->numRows)
{
    // update existing entry …
}
else
{
    // create new entry …
}

However, an edit changes the content and therefore the checksum, so the lookup never matches the old entry. Every edit of a page thus creates a new entry in tl_search, which leads to multiple search results for the same page.
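The effect can be reproduced outside of Contao with a minimal sketch. It replaces tl_search with a plain PHP array and assumes the checksum is an MD5 hash of the page text; the function name indexByChecksum, the row layout and the sample URL are hypothetical and only mirror the WHERE checksum=? AND pid=? lookup quoted above.

```php
<?php
// Hypothetical in-memory stand-in for tl_search; this is not Contao's code.
function indexByChecksum(array &$rows, string $url, int $pid, string $content): void
{
    // Assumption: the checksum is derived from the page content alone
    $checksum = md5($content);

    foreach ($rows as &$row) {
        // Mirrors "WHERE checksum=? AND pid=?"
        if ($row['checksum'] === $checksum && $row['pid'] === $pid) {
            $row['url'] = $url; // update existing entry
            return;
        }
    }
    unset($row);

    // No row matched the checksum: create a new entry
    $rows[] = ['url' => $url, 'pid' => $pid, 'checksum' => $checksum];
}

$rows = [];
indexByChecksum($rows, 'index.html', 1, 'Lorem ipsum.');
// Same page, edited content: the checksum differs, so the lookup misses
indexByChecksum($rows, 'index.html', 1, 'Lorem ipsum. Lorem ipsum dolor sit amet.');

echo count($rows), "\n"; // 2: two rows for one page
```

Both rows carry the same URL and pid, which is why the search lists the page twice.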

Fix?

Shouldn't old entries with the same URL simply be deleted or updated, instead of looking entries up by checksum? Under what circumstances would the checksum lookup ever update an entry with new values? If the page has changed, the checksum is different, so nothing matches. If the page has not changed, there is nothing to update in the first place, I assume.
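As a rough illustration of this suggestion, the sketch below keys the lookup on the URL and pid instead of the checksum. It is an assumption about how such a lookup could work, not necessarily what the actual fix does; the function name indexByUrl, the array standing in for tl_search and the MD5 checksum are hypothetical.

```php
<?php
// Hypothetical in-memory stand-in for tl_search; this is not Contao's code.
function indexByUrl(array &$rows, string $url, int $pid, string $content): void
{
    // Assumption: the checksum is derived from the page content alone
    $checksum = md5($content);

    foreach ($rows as &$row) {
        // Look the entry up by URL and pid rather than by checksum
        if ($row['url'] === $url && $row['pid'] === $pid) {
            $row['checksum'] = $checksum; // refresh the existing entry
            return;
        }
    }
    unset($row);

    // First time this URL is indexed: create a new entry
    $rows[] = ['url' => $url, 'pid' => $pid, 'checksum' => $checksum];
}

$byUrl = [];
indexByUrl($byUrl, 'index.html', 1, 'Lorem ipsum.');
indexByUrl($byUrl, 'index.html', 1, 'Lorem ipsum. Lorem ipsum dolor sit amet.');

echo count($byUrl), "\n"; // 1: the edit updated the existing row
```

With this lookup an edit overwrites the stored checksum in place, so the table holds one row per URL.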

@leofeyer leofeyer added this to the 3.5.16 milestone Aug 12, 2016
@leofeyer leofeyer self-assigned this Aug 12, 2016
@leofeyer
Member

Should be fixed in fccd682. Please test and confirm.

jsonn pushed a commit to jsonn/pkgsrc that referenced this issue Sep 8, 2016
### 4.2.3 (2016-09-06)

 * Do not double URL encode the content syndication links.
 * Use CSS3 transforms instead of transitions to animate the off-canvas navigation.
 * Improve the exception handling when using the resource locator (see #557).
 * Correctly reset the filter menu in parent view.
 * Support all characters but =!<> and whitespace in simple tokens (see contao/core#8436).
 * Check the user's permission when generating links in the picker (see contao/core#8407).
 * Handle forward pages without target in the navigation modules (see contao/core#8377).
 * Provide the same template variables for downloads and enclosures (see contao/core#8392).
 * Handle %n when parsing date formats (see contao/core#8411).
 * Fix the module wizard's accessibility (see contao/core#8391).
 * Correctly initialize TinyMCE in sub-palettes in Firefox (see contao/core#3673).
 * Validate form field names more accurately (see contao/core#8403).
 * Correctly show the ctime, mtime and atime of a folder (see contao/core#8408).
 * Correctly index changed pages (see contao/core#8439).
fritzmg commented Sep 15, 2016

Side note: this problem still exists if you change the content of a page and then open it in the front end under a different URL (e.g. example.org/ vs. example.org/index.html, see also #8460). Though I guess technically that works as expected, since a different URL means different content as well (from a client's perspective).
