XML Sitemap Generator #1901

Closed
brusch opened this issue Sep 4, 2017 · 5 comments

@brusch
Member

brusch commented Sep 4, 2017

  • Documents
  • Objects by using LinkGenerators
  • Generated asynchronously (maintenance script) and saved as static files in /var/sitemaps
  • Configuration UI for exclusions and other simple settings
@brusch brusch added this to the 5.0.0 milestone Sep 4, 2017
@solverat
Contributor

solverat commented Sep 5, 2017

Interesting. PHP-Search, later LuceneSearch, also tried to generate valid XML sitemaps. Since generating a valid sitemap can be really challenging, we decided to remove this feature from the search bundle in the near future.

Together with our OM team we talked a lot about this "challenge" and we also tried to get all the requirements onto the drawing board, but we never managed to put this into practice with Pimcore :)

  • respect multi sites (multi domains / zones, languages, countries)
  • respect images
  • respect canonicals, hidden, restricted, secret documents / objects / assets
  • respect objects with country / language restrictions and/or limited availability
  • fetch documents that are mapped via hard links (since there are no "real" documents in subtrees of hard links)
  • generate multiple XML files (for example per category or for special objects; there is also a limit from Google)

So, I'm really curious about this new feature. :)
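
The "multiple XML files" requirement from the list above can be illustrated with a minimal Python sketch. It assumes the 50,000-URL-per-file limit from the sitemaps.org protocol; the function names and the `sitemap.<section>_<n>.xml` file naming are hypothetical and not taken from Pimcore or any bundle.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive slices of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_urlset(urls):
    """Render one <urlset> document for a slice of URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(base_url, section, file_count):
    """Render a <sitemapindex> that points at the numbered files of a section.

    The file naming scheme here is an assumption made for illustration.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n in range(file_count):
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap.{section}_{n}.xml"
    return ET.tostring(root, encoding="unicode")
```

Each slice returned by `chunk` would become one file via `build_urlset`, and `build_index` would list those files per section (documents, objects, and so on). The sketch ignores the protocol's separate file-size limit.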

@maff
Contributor

maff commented Feb 1, 2018

As this is a broad topic with lots of exceptions and special cases which highly depend on the project, #2528 implements an extendable sitemaps framework based on presta/sitemaps-bundle. While the bundle takes care of the sitemap details, such as writing XML files in a memory-efficient way, splitting files based on Google limits and handling entries in multiple sections, Pimcore adds a lightweight framework consisting of generators and pluggable filters/processors which can be used to build sitemaps in a customized way.

There's a default DocumentTreeGenerator which traverses the document tree and supports sites and hardlinks. It should work for many simple scenarios; if it doesn't, you can either reconfigure it via filters and processors or use completely custom generators. The documentation (available soon) should contain everything worth knowing to get started with custom sitemaps.
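
To make the generator / filter / processor split more concrete, here is a rough Python sketch of the idea. It is not the Pimcore or presta/sitemaps-bundle API: `Document`, `Entry`, `generate`, `published_only` and `add_lastmod` are all hypothetical names, and the assumption is simply that filters decide whether a document is included while processors enrich the resulting entry.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Iterator, Optional, Sequence


@dataclass(frozen=True)
class Document:
    path: str
    published: bool = True
    modified: Optional[str] = None  # ISO 8601 timestamp, if known


@dataclass(frozen=True)
class Entry:
    loc: str
    lastmod: Optional[str] = None


Filter = Callable[[Document], bool]             # returning False drops the document
Processor = Callable[[Entry, Document], Entry]  # returns a (possibly enriched) entry


def generate(
    documents: Iterable[Document],
    base_url: str,
    filters: Sequence[Filter] = (),
    processors: Sequence[Processor] = (),
) -> Iterator[Entry]:
    """Yield one sitemap entry per document that passes every filter."""
    for doc in documents:
        if not all(accept(doc) for accept in filters):
            continue
        entry = Entry(loc=base_url + doc.path)
        for process in processors:
            entry = process(entry, doc)
        yield entry


def published_only(doc: Document) -> bool:
    """Example filter: skip unpublished documents."""
    return doc.published


def add_lastmod(entry: Entry, doc: Document) -> Entry:
    """Example processor: copy the modification date onto the entry."""
    return replace(entry, lastmod=doc.modified)
```

A project-specific rule (for example excluding restricted documents) would then be one more filter passed to `generate`, without touching the traversal itself.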

@solverat
Contributor

solverat commented Feb 1, 2018

@maff: so does this generator also discover, e.g., virtual documents in a hardlink sub-context?

@maff
Contributor

maff commented Feb 1, 2018

Yes, the default document generator also traverses into hardlink children (as I mentioned, this can be completely customized by implementing your own generator). As an example from the demo document structure:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <!-- ... -->

    <url>
        <loc>http://pimcore5.loc/en/basic-examples</loc>
        <lastmod>2014-01-03T08:41:44+00:00</lastmod>
    </url>
    <url>
        <loc>http://pimcore5.loc/en/basic-examples/content-page</loc>
        <lastmod>2014-07-21T06:12:58+00:00</lastmod>
    </url>

    <!-- ... -->

    <url>
        <loc>http://pimcore5.loc/en/advanced-examples/hard-link</loc>
        <lastmod>2013-10-28T10:27:22+00:00</lastmod>
    </url>
    <url>
        <loc>http://pimcore5.loc/en/advanced-examples/hard-link/basic-examples</loc>
        <lastmod>2014-01-03T08:41:44+00:00</lastmod>
    </url>
    <url>
        <loc>http://pimcore5.loc/en/advanced-examples/hard-link/basic-examples/content-page</loc>
        <lastmod>2014-07-21T06:12:58+00:00</lastmod>
    </url>

    <!-- ... -->
</urlset>
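
The traversal behind output like this can be sketched in a few lines of Python. This is an illustration only, not the DocumentTreeGenerator code: `Node`, `hardlink_source` and `walk` are hypothetical names, and the assumption is that a hardlink has no real children and instead exposes the children of its source under its own path.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    key: str  # path segment of this document
    children: List["Node"] = field(default_factory=list)
    hardlink_source: Optional["Node"] = None  # set only on hardlink nodes


def walk(node: Node, parent_path: str = "") -> Iterator[str]:
    """Yield the path of `node` and of everything reachable below it."""
    path = f"{parent_path}/{node.key}"
    yield path
    # Assumption: a hardlink borrows the children of its source document,
    # so they show up a second time under the hardlink's own path.
    source = node.hardlink_source or node
    for child in source.children:
        yield from walk(child, path)
```

With a hypothetical tree where `hard-link` points at a source whose child is `basic-examples`, `walk` yields the `basic-examples` paths twice: once at their real location and once below `hard-link`. The sketch has no cycle guard, so a hardlink whose source is one of its own ancestors would recurse forever.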

@solverat
Contributor

solverat commented Feb 1, 2018

Perfect, great work - as usual ;). Thanks @maff!

@maff maff closed this as completed in #2528 Feb 1, 2018
@brusch brusch modified the milestones: 5.2.0, 5.1.3 Feb 5, 2018