Skip to content
This repository has been archived by the owner on Aug 8, 2022. It is now read-only.

TSFE->isStaticCacheble #271

Open
phathoang opened this issue Nov 10, 2016 · 6 comments
Open

TSFE->isStaticCacheble #271

phathoang opened this issue Nov 10, 2016 · 6 comments

Comments

@phathoang
Copy link

In both hooks, typoLink_PostProc and pageIndexing there's a condition that checks $TSFE->isStaticCacheble()

Is it really needed? In the typoLink_PostProc hook it doesn't make much sense. If I'm on a page with links to 30 subpages, I'd want the subpages to be in the sitemap, even if the page itself has non-cacheable content.

@thomaszbz
Copy link
Member

thomaszbz commented Nov 10, 2016

@phathoang Thanks for pointing this out. So, in terms of link structure, your use case would be

- page (non-cacheable)
    - linked page 1 (cacheable)
    - linked page 2 (eventually cacheable)
    - linked page ... (eventually cacheable)

Right?

We'll have to investigate what actually happens. To my knowledge, linked page 1 should still occur in the sitemap as soon as it is http-requested for the first time. If this is not sufficient for your use case, please report back why you need it differently. If you can suggest a concept which has not a lot of drawbacks, I'd be happy to hear about it.

@phathoang
Copy link
Author

phathoang commented Nov 11, 2016

That's right, so if we have 30 subpages we need to visit them all to get them into the sitemap.

The main problem is that $TSFE->isStaticCacheble() also checks if there's anything non-cacheable, so if there's a page.5 = COA_INT (etc..), the sitemap will not generate at all ( probably similar issue in #186 and allowNoStaticCachable was added to "fix" it ).

What problems can it cause, if we remove the check of USER/COA_INTS?
Consider this example when visiting 'page':

  • page (has COA_INT)
    • linked page 1 (cacheable)
    • linked page 2 (no_cache)

versus:

  • page (cacheable)
    • linked page 1 (cacheable)
    • linked page 2 (no_cache)

If we remove the check of USER/COA_INTS then page 1 and 2 will be added to the sitemap in the first example, but page 2 has no_cache so is it wrong? No, because the same thing happens in the second example as the plugin is now. 'Page' itself has nothing to do with the links it generates so I think it's strange to check $TSFE->isStaticCacheble() (or at least the isINTincScript() in that function) in the typoLink_PostProc hook, and maybe even in the pageIndexing hook.

@thomaszbz
Copy link
Member

thomaszbz commented Nov 11, 2016

@phathoang You seem to be close to a fix that meets your needs. Would you like to provide a pull request?

I'd like to review your fix and see if breaks something.

For your PR, please add yourself to the list of contributors.

@thomaszbz
Copy link
Member

thomaszbz commented Dec 12, 2016

As far as I can see, you fixed it with plugin.metaseo.sitemap.index.allowNoStaticCachable or plugin.metaseo.sitemap.index.allowNoCache. That's indeed a workaround and not the nicest one.

I think, one reason to check against the cache is to sort out unwanted parameters (see #254 for details). That requires 6.2.28+ or 7.6.12+ to work as expected (see #268). Eventually, we need to improve that a bit for your use case: There should be a fallback for non-cacheable pages (eventually remove all parameters or do something like whitelisting or look if the canonical-handling can be applied here, too.).

And/or do the check (which might fail), but then process the subpages at least.

There's also a performance aspect: If we don't check the cache, metaseo thinks that a page changed at every request. The page then gets indexed every time, including child pages. That could slow down a site considerably. For the purpose of checking for changes, we could fall back to SYS_LASTCHANGED, eventually. That's for changed content at least, whereas the timestamp of the cache entry would also cover other changes (like template, etc.)

This comment is open for further investigation, it's rather things that need to be considered for an investigation.

@thomaszbz
Copy link
Member

thomaszbz commented Dec 12, 2016

@phathoang I'd also be interested in the question what was leading to the circumstance that pages were not cached by TYPO3. Were you testing in TYPO3's development mode or logged in, eventually? Maybe we should extend FAQ and documentation in that respect if that was the reason. Or was it some extension we should be aware of?

@phathoang
Copy link
Author

@thomaszbz
Actually I was looking into this because of tq_seo being used on an older installation. In tq_seo the TSFE->isStaticCacheble check is only done in one of the two hooks, which explains why #186 was working with tq_seo but not metaseo.

In our case we're having pw_comments inserted into a page as non-cacheable plugin. So it's not that TYPO3 doesn't cache the page, it does, but not everything since there's non-cachable plugin. So the caching works properly but the pages won't get into the sitemap because of TSFE->isStaticCacheble unless we set plugin.metaseo.sitemap.index.allowNoStaticCachable. We actually removed the whole check of TSFE->isStaticCacheble in our tq_seo so that's why I'm here questioning why it is needed there.

About unwanted parameters, like /?pk_campaign=abc, shouldn't be a problem. Not sure how it happens in #254 but looking at the code, /?pk_campaign=abc shouldnt appear in the sitemap. In page-indexing hook there's a getPageUrl function which checks if there's a cHash, if not, the url will be generated with

             $linkConf = array(
                 'parameter' => $GLOBALS['TSFE']->id,
             );
             $ret = $GLOBALS['TSFE']->cObj->typoLink_URL($linkConf);
             $ret = $this->processLinkUrl($ret);

which shouldn't contain the unwanted get-parameter since there's no cHash.

And in the link-hook it shouldn't be a problem either, since the generated links don't have the unwanted parameter.

About the performance, I'm fine with checking cache, but I don't see why there's a need to check for non-cachables. I'm not able to submit a pull-request atm but I think I would have just removed the this part of code in FrontendUtility::isCacheable

        // don't parse if page is not cacheable
        if (empty($conf['allowNoStaticCachable']) && !$TSFE->isStaticCacheble()) {
            return false;
        }

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants