improve support for multilanguage sitemaps #1557

Closed
wants to merge 1 commit into from

4 participants

@mihaisucan

I am working on a multilanguage website and need a sitemap.xml that includes the correct URL for each page in each language.

Currently django-cms generates URLs without any language prefix, even when the multilanguage URL middleware is enabled. It also includes all pages at once but takes the slugs from only one language (the current site language). The result is a mix-up: English-only pages end up with French slugs, and so on.

This pull request does the following:

  • CMSSitemap only includes URLs for the current site language by default. If I am not mistaken, this lets you serve multiple language-specific sitemap.xml files: point to /en/sitemap.xml or /other-language/sitemap.xml from robots.txt and make sure your urls.py is set up correctly.
  • I included a fix for Title.objects.public(), which doesn't seem to work.
  • Included a new sitemaps/utils.py that people can use to generate a single sitemap.xml with all the pages in all languages. Example:
from cms.sitemaps import CMSSitemap
from cms.sitemaps.utils import MakeMultilanguageSitemap
sitemaps = { 'sitemaps': MakeMultilanguageSitemap({
    'cmspages': CMSSitemap,
}),}

I use this to generate correct URLs for all multilanguage pages and apps. This utility function works for me with other sitemap classes as well.

Please let me know if further changes are needed. Thank you!
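
The per-language behaviour described in the list above can be sketched without Django. This is only an illustration under stated assumptions: `TitleRow`, `TITLES` and `sitemap_locations` are hypothetical stand-ins, not django-cms names, and the `/<language>` prefix mirrors what the PR's utility adds to each location.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a title record: one row per page per language.
@dataclass(frozen=True)
class TitleRow:
    page_id: int
    language: str
    path: str  # slug path in that language, without a language prefix


TITLES = [
    TitleRow(1, "en", "/about/"),
    TitleRow(1, "fr", "/a-propos/"),
    TitleRow(2, "en", "/news/"),  # English-only page
]


def sitemap_locations(titles, language):
    """Return prefixed URLs for the titles that exist in `language`."""
    return ["/%s%s" % (t.language, t.path) for t in titles if t.language == language]


print(sitemap_locations(TITLES, "en"))
print(sitemap_locations(TITLES, "fr"))
```

Filtering on the title's language first means an English-only page never appears in the French sitemap, which is the mix-up the description complains about.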

@digi604 digi604 commented on the diff
cms/models/managers.py
@@ -171,7 +171,7 @@ def get_page_slug(self, slug, site=None):
     # created new public method to meet test case requirement and to get a list of titles for published pages
     def public(self):
-        return self.get_query_set().filter(page__publisher_is_draft=False, page__published=True)
+        return self.get_query_set().filter(page__published=True)
@digi604 Divio AG member
digi604 added a note

This is very bad and will break a lot of things

Do you have a better suggestion? I can simply drop this change entirely and put the full query I need in cms_sitemap.py.
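
One possible reading of the objection, sketched in plain Python. It assumes that a published page is stored as a draft row and a public row that both carry `published=True`. The rows and the two helper functions below are hypothetical and only mimic the two filters from the diff.

```python
# Hypothetical rows: (title, publisher_is_draft, published).
ROWS = [
    ("About (draft copy)", True, True),
    ("About (public copy)", False, True),
    ("Unpublished (draft copy)", True, False),
]


def public_original(rows):
    """Mimics filter(page__publisher_is_draft=False, page__published=True)."""
    return [title for title, is_draft, published in rows if not is_draft and published]


def public_proposed(rows):
    """Mimics filter(page__published=True), the change under review."""
    return [title for title, is_draft, published in rows if published]


print(public_original(ROWS))
print(public_proposed(ROWS))
```

Under that assumption the proposed filter also returns draft copies, so every caller of `public()` would see each published page twice.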

@digi604
Divio AG member

Please add a test for this change and run the test suite. You will see that the build is currently broken because of the queryset changes.

@yakky

@digi604, AFAIK this is superseded in 2.4+ by using i18n_patterns

@yakky

Closing, superseded in 2.4

@yakky yakky closed this
@mkoistinen
Divio AG member

Hmmm, so, if this was superseded in 2.4, then why do the sitemaps still only include the EN pages?

@digi604
Divio AG member

sure?

@mkoistinen
Divio AG member

Hmm, well, if I open //sitemaps/sitemap-cmspages.xml, I get only the /en/ urls.

If I then go visit an /es/ url, then come back to the same sitemaps, I get only the /es/ urls.

Is this how it is supposed to work?
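
The behaviour reported here is what you would expect when a sitemap reads the currently active language, which follows the visitor's last request. A minimal simulation, assuming hypothetical stand-ins throughout (`get_language`, `activate`, `use_language`, `URLS`); Django ships a real context manager for this purpose, `django.utils.translation.override`, which this sketch only imitates:

```python
import threading
from contextlib import contextmanager

_state = threading.local()  # stand-in for the per-thread active language


def get_language():
    return getattr(_state, "language", "en")


def activate(language):
    _state.language = language


@contextmanager
def use_language(language):
    """Pin `language` for the duration of the block, then restore the old one."""
    previous = get_language()
    activate(language)
    try:
        yield
    finally:
        activate(previous)


URLS = {"en": ["/en/about/"], "es": ["/es/acerca/"]}  # hypothetical data


def sitemap_urls():
    # Depends on ambient state: this is why the output changes between visits.
    return URLS[get_language()]


def full_sitemap(languages):
    """Collect URLs for every language regardless of the ambient one."""
    urls = []
    for language in languages:
        with use_language(language):
            urls.extend(sitemap_urls())
    return urls


print(full_sitemap(["en", "es"]))
```

Pinning the language explicitly per entry makes the result independent of which URL was visited last.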

@digi604
Divio AG member

nope

@digi604 digi604 reopened this
@yakky yakky referenced this pull request
Merged

Multilanguage sitemaps #2456

@digi604 digi604 closed this in #2456
Commits on Dec 12, 2012
  1. @mihaisucan
Showing with 57 additions and 7 deletions.
  1. +1 −1 cms/models/managers.py
  2. +10 −6 cms/sitemaps/cms_sitemap.py
  3. +46 −0 cms/sitemaps/utils.py
2 cms/models/managers.py
@@ -171,7 +171,7 @@ def get_page_slug(self, slug, site=None):
     # created new public method to meet test case requirement and to get a list of titles for published pages
     def public(self):
-        return self.get_query_set().filter(page__publisher_is_draft=False, page__published=True)
+        return self.get_query_set().filter(page__published=True)
     def drafts(self):
         return self.get_query_set().filter(page__publisher_is_draft=True)
16 cms/sitemaps/cms_sitemap.py
@@ -1,5 +1,7 @@
 # -*- coding: utf-8 -*-
 from django.contrib.sitemaps import Sitemap
+from django.utils.translation import get_language
+from cms.models import Title
 def from_iterable(iterables):
     """
@@ -14,16 +16,18 @@ class CMSSitemap(Sitemap):
     priority = 0.5
     def items(self):
-        from cms.utils.page_resolver import get_page_queryset
-        page_queryset = get_page_queryset(None)
-        all_pages = page_queryset.published().filter(login_required=False)
-        return all_pages
+        titles = Title.objects.public().filter(page__login_required=False, \
+            language=get_language())
+        return titles
-    def lastmod(self, page):
+    def lastmod(self, title):
+        page = title.page
         modification_dates = [page.changed_date, page.publication_date]
         plugins_for_placeholder = lambda placeholder: placeholder.get_plugins()
         plugins = from_iterable(map(plugins_for_placeholder, page.placeholders.all()))
         plugin_modification_dates = map(lambda plugin: plugin.changed_date, plugins)
         modification_dates.extend(plugin_modification_dates)
         return max(modification_dates)
-
+
+    def location(self, title):
+        return title.page.get_absolute_url()
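
The `lastmod` logic above takes the maximum over the page's own dates and the `changed_date` of every plugin. A standalone sketch of that computation follows. It assumes that some dates may be missing (`None`), which `max()` cannot compare against `datetime` values under Python 3. `latest_modification` is a hypothetical helper, not part of django-cms.

```python
from datetime import datetime


def latest_modification(page_dates, plugin_dates):
    """Return the newest non-missing date, or None if there is none."""
    candidates = [d for d in list(page_dates) + list(plugin_dates) if d is not None]
    return max(candidates) if candidates else None


changed = datetime(2012, 12, 1)
plugin_changed = datetime(2012, 12, 12)
# The second page date is None, standing in for an unset publication date.
print(latest_modification([changed, None], [plugin_changed]))
```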
46 cms/sitemaps/utils.py
@@ -0,0 +1,46 @@
+from django.utils import translation
+from django.conf import settings
+
+LANGUAGES = getattr(settings, 'LANGUAGES', [])
+MIDDLEWARE_CLASSES = getattr(settings, 'MIDDLEWARE_CLASSES', ())
+MULTILINGUAL_URL = \
+    'cms.middleware.multilingual.MultilingualURLMiddleware' \
+    in MIDDLEWARE_CLASSES
+
+def GetMultilanguageSitemapClass(sitemap, language):
+    """Wrap a Sitemap class within a language-aware class"""
+    class InnerClass(sitemap):
+        language = None
+
+        def __init__(self, *args, **kwargs):
+            self.language = language
+            super(InnerClass, self).__init__(*args, **kwargs)
+
+        def items(self, *args, **kwargs):
+            translation.activate(self.language)
+            result = super(InnerClass, self).items(*args, **kwargs)
+            translation.deactivate()
+            return result
+
+        def location(self, *args, **kwargs):
+            translation.activate(self.language)
+            url = super(InnerClass, self).location(*args, **kwargs)
+            translation.deactivate()
+            return '/%s%s' % (self.language, url)
+
+    return InnerClass
+
+def MakeMultilanguageSitemap(sitemaps):
+    """Takes a sitemap dict and modify it to hold the same sitemap classes
+    for every configured language"""
+    if not MULTILINGUAL_URL:
+        return sitemaps
+
+    for name, sitemap in sitemaps.items():
+        del sitemaps[name]
+        for lang in LANGUAGES:
+            new_name = '%s_%s' % (name, lang[0])
+            sitemaps[new_name] = GetMultilanguageSitemapClass(sitemap, lang[0])
+
+    return sitemaps
+
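
`MakeMultilanguageSitemap` deletes and adds keys while looping over `sitemaps.items()`. That works on Python 2, where `items()` returns a list copy, but on Python 3 changing a dict during iteration can raise `RuntimeError`. A minimal sketch of the same expansion that builds a new dict instead; `expand_per_language`, `make_language_class` and `DemoSitemap` are hypothetical names, and the wrapper here only pins a `language` attribute rather than reproducing the PR's `items`/`location` overrides:

```python
def make_language_class(sitemap_cls, language):
    """Subclass `sitemap_cls` and pin a `language` attribute on the subclass."""
    name = "%s_%s" % (sitemap_cls.__name__, language)
    return type(name, (sitemap_cls,), {"language": language})


def expand_per_language(sitemaps, languages):
    """Return a new dict with one entry per (sitemap, language) pair."""
    expanded = {}
    for name, sitemap_cls in sitemaps.items():
        for code, _label in languages:
            expanded["%s_%s" % (name, code)] = make_language_class(sitemap_cls, code)
    return expanded


class DemoSitemap(object):
    """Placeholder for a real Sitemap subclass."""


LANGUAGES = [("en", "English"), ("fr", "French")]  # same shape as settings.LANGUAGES
result = expand_per_language({"cmspages": DemoSitemap}, LANGUAGES)
print(sorted(result))
```

Returning a fresh dict also leaves the caller's input untouched, which makes the helper safe to call more than once.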