Contao version/ build not updating in backend #1678

Closed
verkruemelt opened this issue Dec 14, 2018 · 18 comments

Comments

@verkruemelt

I ran an update via Composer:
[image]
The core bundle was updated to 4.4.31, so I assumed I would see this version number in the backend too.
Unfortunately, the backend still shows 4.4.20, the version it had at installation:
[image]
The same happens in both the test and the production environment.

Is this a bug or an undocumented feature? 😄

@fiedsch
Contributor

fiedsch commented Dec 14, 2018

I can't confirm this; I have never experienced it.

In addition, the Composer log states that the update was from 4.4.30 to 4.4.31. So, just to be sure: did you check the correct backend?

@verkruemelt
Author

That is correct. I ran the update from 4.4.30 to 4.4.31 today, on both systems.
But the backend shows version 4.4.20 on one system and 4.4.28 on the other.

If I remember correctly, these should be the versions with which each Contao instance was originally installed. I also ran the system maintenance on each system and did a full reload of the respective website with [Ctrl] + [F5]. But the old version number is still displayed.

@Toflar
Member

Toflar commented Dec 14, 2018

Maybe a modified be_main template?

@ausi
Member

ausi commented Dec 14, 2018

Your screenshot of the composer output looks like there is something missing. Did you see the following lines in the output?

ocramius/package-versions:  Generating version class...
ocramius/package-versions: ...done generating version class

Please check if the file vendor/ocramius/package-versions/src/PackageVersions/Versions.php contains the correct version for contao/core-bundle.

@leofeyer
Member

Did you clear the cache?

@verkruemelt
Author

Where can I find this template?

> Maybe a modified be_main template?

Nope

> Your screenshot of the composer output looks like there is something missing. Did you see the following lines in the output?
>
> ocramius/package-versions:  Generating version class...
> ocramius/package-versions: ...done generating version class
>
> Please check if the file vendor/ocramius/package-versions/src/PackageVersions/Versions.php contains the correct version for contao/core-bundle.

These lines are not in the output.
I can't find the folder <root>/vendor/ocramius, only <root>/vendor/contao/core-bundle/src/Resources/contao/classes/Versions.php, and that file does not contain a version either.

> Did you clear the cache?

Yep!

@fritzmg
Contributor

fritzmg commented Dec 17, 2018

Post your current composer.json.

@verkruemelt
Author

{
    "name": "contao/managed-edition",
    "type": "project",
    "description": "Contao Open Source CMS",
    "license": "LGPL-3.0-or-later",
    "authors": [
        {
            "name": "Leo Feyer",
            "homepage": "https://github.com/leofeyer"
        }
    ],
    "require": {
        "php": "^5.6|^7.0",
        "contao/calendar-bundle": "^4.4",
        "contao/comments-bundle": "^4.4",
        "contao/faq-bundle": "^4.4",
        "contao/listing-bundle": "^4.4",
        "contao/manager-bundle": "4.4.*",
        "contao/news-bundle": "^4.4",
        "contao/newsletter-bundle": "^4.4",
        "terminal42/contao-changelanguage": "^3.1",
        "sensio/framework-extra-bundle": "^3.0.29"
    },
    "conflict": {
    },
    "config": {
        "component-dir": "assets"
    },
    "extra": {
        "branch-alias": {
            "dev-4.4": "4.4.x-dev"
        }
    },
    "scripts": {
        "post-install-cmd": [
            "Contao\\ManagerBundle\\Composer\\ScriptHandler::initializeApplication"
        ],
        "post-update-cmd": [
            "Contao\\ManagerBundle\\Composer\\ScriptHandler::initializeApplication"
        ]
    }
}

Files are identical on prod and test.

@xchs
Contributor

xchs commented Dec 17, 2018

Rename composer.lock to composer.lock.bak and run a complete Composer update.

@fritzmg
Contributor

fritzmg commented Dec 17, 2018

Also confirm which composer version you are currently using.

@verkruemelt
Author

composer --version
Using config.component-dir has been deprecated. Please use extra.contao-component-dir instead.
Composer 1.6.3 2018-01-31 16:28:17

The newest version is 1.8.0, so I will update.
After a complete Composer update, I get an HTTP 500.
Apache error log:

PHP Fatal error:  Uncaught RuntimeException: Unable to create the store directory (/var/www/html/cms/var/cache/prod/http_cache).

@xchs
Contributor

xchs commented Dec 17, 2018

Is the directory writable or have you reached the disk quota?

BTW: We should discuss this in the forums!

@verkruemelt
Author

The owner of all files and folders is set to www-data after the Composer update (via a script). The disk quota has not been reached.

@ausi
Member

ausi commented Dec 17, 2018

> Did you clear the cache?
>
> Yep!

Can you please check whether manually deleting the folders var/cache/prod and var/cache/dev resolves the problem?

@aschempp
Member

Be aware that the versions are not stored in the cache. They are stored in the vendor folder.

> ocramius/package-versions:  Generating version class...
> ocramius/package-versions: ...done generating version class

This is the only relevant suggestion. If these lines do not appear, there is an issue with your Composer update, or you ran it with the --no-plugins flag or the like. Somehow, the versions plugin is not being updated. You will most likely find the wrong version in vendor/ocramius/package-versions/src/PackageVersions/Versions.php
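
To cross-check which version Composer itself recorded, regardless of how the backend obtains its version string, one option is to read `vendor/composer/installed.json`. The following is a minimal sketch, not part of Contao or Composer: the helper name `installed_version` is made up for illustration, and it assumes the script is run from the project root. It handles both the plain-list layout written by Composer 1.x and the `{"packages": [...]}` layout written by Composer 2.x.

```python
import json
from pathlib import Path


def installed_version(installed_json: str, package: str):
    """Return the version Composer recorded for `package`, or None if absent.

    `installed_version` is a hypothetical helper for this sketch.
    """
    data = json.loads(installed_json)
    # Composer 1.x writes a plain list; Composer 2.x wraps it in {"packages": [...]}.
    packages = data["packages"] if isinstance(data, dict) else data
    for entry in packages:
        if entry.get("name") == package:
            return entry.get("version")
    return None


if __name__ == "__main__":
    # Assumption: the current working directory is the Contao project root.
    path = Path("vendor/composer/installed.json")
    if path.is_file():
        print(installed_version(path.read_text(encoding="utf-8"), "contao/core-bundle"))
    else:
        print("vendor/composer/installed.json not found")
```

If this prints 4.4.31 while the backend still shows an older number, the packages were updated and the stale value comes from wherever the backend reads its version.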

@ausi
Member

ausi commented Dec 19, 2018

> Be aware that the versions are not stored in the cache. They are stored in the vendor folder.

In Contao 4.4 too?

@leofeyer
Member

I don't think so.

@leofeyer
Member

leofeyer commented Jan 2, 2019

> Can you please try if manually deleting the folders var/cache/prod and var/cache/dev resolves the problem?

@verkruemelt Did you try this and did it solve the issue?

leofeyer pushed a commit that referenced this issue Apr 30, 2020
Description
-----------

This pull request improves the search query performance.

1. 792499a moves the word matching into a subselect which seems to help MySQL make better use of the indexes.
2. a2ce8e8 removes the extra counting of wildcards if it is a wildcard-only search because the count is not needed in this case.

In my tests with a `tl_search_index` table of about 1.5 million rows, a search like `*foo*` dropped from 10 seconds to 1 second, and `*foo* *bar*` from 12 to 1.3 seconds.

Commits
-------

792499af Improve search query performance
a2ce8e84 Only count wildcard matches if necessary
ec39a0cd Merge branch '4.4' into fix/search-query-performance
leofeyer pushed a commit that referenced this issue Jul 3, 2020
Description
-----------

This PR is based on #1678

It improves the performance of the search further by storing the words in their own table with a unique index.

It also changes the check that ensures all keywords are matched, which should now be faster and more accurate. It fixes bugs like searching for [`Contao Conta*`](https://contao.org/de/suche.html?keywords=Contao+Conta*).

In my tests with a `tl_search_index` table of about 1.5 million rows, a search like `*foo*` dropped from 10 seconds to 0.2 seconds, and `*foo* *bar*` from 12 to 0.3 seconds.

#### ToDo (for this pull request)
- [ ] ~~Check if some of the optimizations are bug fixes that need to be added to #1678~~
- [x] Rebase once #1678 got merged upstream
- [x] Update the index process to save the words in the new table
- [x] Check how and when to update the `vectorLength` of the documents

#### ToDo (for a contao/search library)
- [ ] Functional Tests (if possible)
- [ ] Move logic to a search service or library
- [ ] ~~Use doctrine entities instead of DCA~~

#### Further ideas
- [ ] Only store the IDs of the results in the cache JSON and load the text from the database when it is used.

Related: https://github.com/contao/core-bundle/issues/242

### How the search works now (updated 2020-06-25):

1. The `tl_search` table holds all documents (pages) as one row for every document.
2. In `tl_search_words` all the words of the whole corpus are stored (one row per unique word) together with the number of documents the word appears in (document frequency)
3. `tl_search_index` is the connection between words and documents (one row for every unique word/document combination) and stores how often the word appears in the document (term frequency)
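
As a rough illustration of how the three tables relate, here is a small in-memory sketch. It is not the Contao implementation: the function name `build_index`, the whitespace tokenizer, and the dictionaries standing in for the tables are all assumptions made for this example.

```python
from collections import Counter


def build_index(documents):
    """Build stand-ins for the term and index tables from {doc_id: text}.

    `documents` plays the role of tl_search (one entry per document).
    Returns (terms, index):
      terms: term -> document frequency (number of documents containing it)
      index: (term, doc_id) -> term frequency (occurrences in that document)
    """
    terms = Counter()
    index = {}
    for doc_id, text in documents.items():
        # Assumption: naive lowercase whitespace tokenization.
        counts = Counter(text.lower().split())
        for term, term_frequency in counts.items():
            terms[term] += 1                      # one more document has this term
            index[(term, doc_id)] = term_frequency
    return terms, index


documents = {1: "contao search contao", 2: "search index"}
terms, index = build_index(documents)
print(terms["search"])          # document frequency of "search"
print(index[("contao", 1)])     # term frequency of "contao" in document 1
```

Each unique word appears once in `terms` and each word/document pair appears once in `index`, which mirrors the "one row per unique word" and "one row per word/document combination" layout described above.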

When we do an actual search we calculate the similarity between the query and all matching documents using the [cosine similarity algorithm](https://en.wikipedia.org/wiki/Cosine_similarity) with [tf-idf weighted](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) vector values:

```
Queryᵢ = log(1+(N/nt))
Documentᵢ = 1 + log ƒt,d

                     ___                                   
                     ╲                                     
                     ╱    Queryᵢ × Documentᵢ           
                     ‾‾‾                                   
cos(ϕ) = ──────────────────────────────────────────────────
               ________________         ___________________
              ╱  ___                   ╱  ___              
             ╱   ╲          2         ╱   ╲             2
            ╱    ╱    Queryᵢ    ×    ╱    ╱    Documentᵢ 
          ╲╱     ‾‾‾               ╲╱     ‾‾‾              
```

This formula results in a similarity score between `0` (doesn’t match at all) and `1` (exact same words as the query).

With *idf* we make sure that words that are rare in the whole corpus get high weights while very common words get low weights. The *tf* score gives more weight to words that appear very often in the same document.

The cosine similarity then normalizes for the length of the documents. This means that you cannot “trick” the search index by creating a document that just has every word multiple times in it.
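
The scoring described above can be sketched in a few lines. This is only an illustration of the formula, not the SQL-based code in `Search.php`: the function name `relevance`, the whitespace tokenizer, the use of the natural logarithm, and the toy corpus are all assumptions.

```python
import math
from collections import Counter


def relevance(query_terms, doc_text, doc_freq, total_docs):
    """Cosine similarity between a query and one document.

    Query weight per term:    log(1 + N / n_t)   (idf)
    Document weight per term: 1 + log(tf)        (tf)
    """
    tf = Counter(doc_text.lower().split())
    doc_vec = {t: 1 + math.log(c) for t, c in tf.items()}
    # Query terms that do not occur anywhere in the corpus are skipped.
    query_vec = {
        t: math.log(1 + total_docs / doc_freq[t])
        for t in set(query_terms)
        if doc_freq.get(t)
    }
    dot = sum(w * doc_vec.get(t, 0.0) for t, w in query_vec.items())
    query_len = math.sqrt(sum(w * w for w in query_vec.values()))
    doc_len = math.sqrt(sum(w * w for w in doc_vec.values()))
    if query_len == 0 or doc_len == 0:
        return 0.0  # avoid division by zero
    return dot / (query_len * doc_len)


# Toy corpus: document frequencies counted by hand for three documents.
doc_freq = {"contao": 2, "search": 1, "symfony": 1, "bundle": 1}
total_docs = 3

plain = relevance(["contao"], "contao search", doc_freq, total_docs)
stuffed = relevance(["contao"], "contao contao contao search search search",
                    doc_freq, total_docs)
unrelated = relevance(["contao"], "symfony bundle", doc_freq, total_docs)
print(plain, stuffed, unrelated)
```

In this sketch, repeating every word of a document the same number of times leaves the score unchanged, because the numerator and the document's vector length grow by the same factor. That is the normalization effect the paragraph above refers to.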

Commits
-------

f0955a7a Improve search query performance
0fcb18d1 Only count wildcard matches if necessary
20e88805 Move search words to their own table
7f0d95e0 Improve search performance
da220368 Use cosine similarity to rank search results
e84a6afc Rename cosineSimilarity back to relevance
b9db1675 Use MySQL variables to prevent multiple count computations
4a48b46f Coding style
803855ac Fix division by zero
bb4cb665 Adapt indexing to the new data structure
d14eb239 Drop search tables instead of migrating the data
e7bb1f0b Remove obsolete language column
08350d09 Add unique index for word and pid
4f48c967 Fix division by zero
a8b7052d Merge branch master into feature/efficient-search-storage

Conflicts:
	core-bundle/src/Resources/contao/library/Contao/Search.php
67fa2c77 Remove unnecessary default values
99c999d1 Update vectorLength of 100 random documents when indexing
51f0dd08 Coding style
c29db11f Coding style
3b6c19ce Comment the vector length update process
fba1e98d Rename tl_search_words to tl_search_term
463ed5b2 Rename tl_search_words to tl_search_term
debb6343 Ensure that the relevance is always above zero
bf861148 CS fixes
a9698c2c Added missing default value for vectorLength
cd2ae6eb Also delete search entries from the tl_search_term table
235bdaf0 Add tl_search_term to maintenance description
3431d421 Fix syntax error
6fe7dc90 Use contao.search.indexer service to purge deleted pages
8e513c43 Fix missing group by clause
d3ec34d2 Cast integer terms to string
14098c95 Fix unsigned value is out of range error
52b69b86 Try to prevent deadlocks
86e7d733 Fix concurrent indexing of the same page
a0e28fad Fix duplicate error for tl_search_index termId-pid
7e96c376 Add index for documentFrequency to prevent deadlocks
dbc587bd Remove obsolete index
dc23f81c Try to fix another deadlock
fb90e2fc Revert "Try to fix another deadlock"

This reverts commit dc23f81cd7844e847ec9001a8e5f298e2974403c.
84c7e433 Fix bug with division by zero
60535614 Lock tables to prevent deadlocks
7a3009e8 CS fixes
8450dcfd Merge branch 'master' into feature/efficient-search-storage