Sitemap hreflang syntax invalid for regional language variants fix #5638

Merged
merged 3 commits into from May 13, 2019
12 changes: 6 additions & 6 deletions readthedocs/projects/constants.py
@@ -250,12 +250,12 @@
     ('zh', 'Chinese'),
     ('zu', 'Zulu'),
     # Try these to test our non-2 letter language support
-    ('nb_NO', 'Norwegian Bokmal'),
-    ('pt_BR', 'Brazilian Portuguese'),
-    ('es_MX', 'Mexican Spanish'),
-    ('uk_UA', 'Ukrainian'),
-    ('zh_CN', 'Simplified Chinese'),
-    ('zh_TW', 'Traditional Chinese'),
+    ('nb-NO', 'Norwegian Bokmal'),
+    ('pt-BR', 'Brazilian Portuguese'),
+    ('es-MX', 'Mexican Spanish'),
+    ('uk-UA', 'Ukrainian'),
+    ('zh-CN', 'Simplified Chinese'),
+    ('zh-TW', 'Traditional Chinese'),
Member

I don't think we can change this.

LANGUAGES is used for URLs, so changing this constant would break many existing URLs. Also, these constants match the language setting from Sphinx.
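
To illustrate the concern, here is a minimal sketch of how renaming a code changes URL matching. The regex is built the same way as LANGUAGES_REGEX in constants.py; the two-entry language lists, the helper names, and the `/<language>/...` URL shape are assumptions made for this example only, not the project's actual URL configuration.

```python
import re

# Illustrative two-entry subsets of LANGUAGES, before and after the rename.
OLD_LANGUAGES = (('zh', 'Chinese'), ('zh_CN', 'Simplified Chinese'))
NEW_LANGUAGES = (('zh', 'Chinese'), ('zh-CN', 'Simplified Chinese'))


def language_regex(languages):
    # Same construction as LANGUAGES_REGEX in constants.py.
    return '|'.join([re.escape(code[0]) for code in languages])


def matches_language(languages, path):
    # Hypothetical URL shape: /<language>/<rest of path>
    pattern = r'^/(?:{})/'.format(language_regex(languages))
    return re.match(pattern, path) is not None


# An existing underscore URL stops matching once the code is renamed.
print(matches_language(OLD_LANGUAGES, '/zh_CN/latest/'))
print(matches_language(NEW_LANGUAGES, '/zh_CN/latest/'))
```

Under these assumptions, a link to `/zh_CN/latest/` that resolved before the rename no longer matches afterwards, which is the breakage described above.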

Member Author

@humitos we can add a method to the sitemap_xml view that changes the lang if it's not in the correct format, something like if lang == 'zh_CN': return 'zh-CN'. It would be a hacky process though. What do you think we can do here?

Member

Sounds hacky, but good enough for me, considering that it's what we need and it's encapsulated inside the sitemap_xml view.

Is replacing _ with - the only change needed? Did you check the sitemap reference?

What about the rest of the values? Are they valid?

Member Author

@humitos yes, we only need to replace _ with -. I updated the PR, please have a look.
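
A minimal sketch of the conversion discussed in this thread, assuming a hypothetical helper named `hreflang_code` (the name and where it would live inside the sitemap_xml view are illustrative, not taken from the PR). It leaves LANGUAGES untouched and only rewrites the value when it is emitted as an hreflang attribute.

```python
def hreflang_code(language):
    """Return the language code with underscores replaced by hyphens.

    Hypothetical helper: two-letter codes such as 'en' pass through
    unchanged, while regional variants such as 'zh_CN' become 'zh-CN'.
    """
    return language.replace('_', '-')


# The six regional codes from LANGUAGES, converted for use in hreflang.
for code in ('nb_NO', 'pt_BR', 'es_MX', 'uk_UA', 'zh_CN', 'zh_TW'):
    print(code, '->', hreflang_code(code))
```

Because the stored project language and the URL prefix keep their underscore form, existing URLs and the Sphinx language value are unaffected by this approach.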

)

LANGUAGES_REGEX = '|'.join([re.escape(code[0]) for code in LANGUAGES])
32 changes: 32 additions & 0 deletions readthedocs/projects/migrations/0043_change-language-code.py
@@ -0,0 +1,32 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.11.20 on 2019-04-27 21:05
from __future__ import unicode_literals

from django.db import migrations, models


def change_language_code(apps, schema_editor):
    Project = apps.get_model('projects', 'Project')

    Project.objects.filter(language='nb_NO').update(language='nb-NO')
    Project.objects.filter(language='pt_BR').update(language='pt-BR')
    Project.objects.filter(language='es_MX').update(language='es-MX')
    Project.objects.filter(language='uk_UA').update(language='uk-UA')
    Project.objects.filter(language='zh_CN').update(language='zh-CN')
    Project.objects.filter(language='zh_TW').update(language='zh-TW')


class Migration(migrations.Migration):

    dependencies = [
        ('projects', '0042_increase_env_variable_value_max_length'),
    ]

    operations = [
        migrations.RunPython(change_language_code),
        migrations.AlterField(
            model_name='project',
            name='language',
            field=models.CharField(choices=[('aa', 'Afar'), ('ab', 'Abkhaz'), ('acr', 'Achi'), ('af', 'Afrikaans'), ('agu', 'Awakateko'), ('am', 'Amharic'), ('ar', 'Arabic'), ('as', 'Assamese'), ('ay', 'Aymara'), ('az', 'Azerbaijani'), ('ba', 'Bashkir'), ('be', 'Belarusian'), ('bg', 'Bulgarian'), ('bh', 'Bihari'), ('bi', 'Bislama'), ('bn', 'Bengali'), ('bo', 'Tibetan'), ('br', 'Breton'), ('ca', 'Catalan'), ('caa', "Ch'orti'"), ('cac', 'Chuj'), ('cab', 'Garífuna'), ('cak', 'Kaqchikel'), ('co', 'Corsican'), ('cs', 'Czech'), ('cy', 'Welsh'), ('da', 'Danish'), ('de', 'German'), ('dz', 'Dzongkha'), ('el', 'Greek'), ('en', 'English'), ('eo', 'Esperanto'), ('es', 'Spanish'), ('et', 'Estonian'), ('eu', 'Basque'), ('fa', 'Iranian'), ('fi', 'Finnish'), ('fj', 'Fijian'), ('fo', 'Faroese'), ('fr', 'French'), ('fy', 'Western Frisian'), ('ga', 'Irish'), ('gd', 'Scottish Gaelic'), ('gl', 'Galician'), ('gn', 'Guarani'), ('gu', 'Gujarati'), ('ha', 'Hausa'), ('hi', 'Hindi'), ('he', 'Hebrew'), ('hr', 'Croatian'), ('hu', 'Hungarian'), ('hy', 'Armenian'), ('ia', 'Interlingua'), ('id', 'Indonesian'), ('ie', 'Interlingue'), ('ik', 'Inupiaq'), ('is', 'Icelandic'), ('it', 'Italian'), ('itz', "Itza'"), ('iu', 'Inuktitut'), ('ixl', 'Ixil'), ('ja', 'Japanese'), ('jac', "Popti'"), ('jv', 'Javanese'), ('ka', 'Georgian'), ('kjb', "Q'anjob'al"), ('kek', "Q'eqchi'"), ('kk', 'Kazakh'), ('kl', 'Kalaallisut'), ('km', 'Khmer'), ('kn', 'Kannada'), ('knj', 'Akateko'), ('ko', 'Korean'), ('ks', 'Kashmiri'), ('ku', 'Kurdish'), ('ky', 'Kyrgyz'), ('la', 'Latin'), ('ln', 'Lingala'), ('lo', 'Lao'), ('lt', 'Lithuanian'), ('lv', 'Latvian'), ('mam', 'Mam'), ('mg', 'Malagasy'), ('mi', 'Maori'), ('mk', 'Macedonian'), ('ml', 'Malayalam'), ('mn', 'Mongolian'), ('mop', 'Mopan'), ('mr', 'Marathi'), ('ms', 'Malay'), ('mt', 'Maltese'), ('my', 'Burmese'), ('na', 'Nauru'), ('ne', 'Nepali'), ('nl', 'Dutch'), ('no', 'Norwegian'), ('oc', 'Occitan'), ('om', 'Oromo'), ('or', 'Oriya'), ('pa', 'Panjabi'), ('pl', 'Polish'), ('pnb', 'Western Punjabi'), ('poc', 'Poqomam'), ('poh', 'Poqomchi'), ('ps', 'Pashto'), ('pt', 'Portuguese'), ('qu', 'Quechua'), ('quc', "K'iche'"), ('qum', 'Sipakapense'), ('quv', 'Sakapulteko'), ('rm', 'Romansh'), ('rn', 'Kirundi'), ('ro', 'Romanian'), ('ru', 'Russian'), ('rw', 'Kinyarwanda'), ('sa', 'Sanskrit'), ('sd', 'Sindhi'), ('sg', 'Sango'), ('si', 'Sinhala'), ('sk', 'Slovak'), ('skr', 'Saraiki'), ('sl', 'Slovenian'), ('sm', 'Samoan'), ('sn', 'Shona'), ('so', 'Somali'), ('sq', 'Albanian'), ('sr', 'Serbian'), ('ss', 'Swati'), ('st', 'Southern Sotho'), ('su', 'Sudanese'), ('sv', 'Swedish'), ('sw', 'Swahili'), ('ta', 'Tamil'), ('te', 'Telugu'), ('tg', 'Tajik'), ('th', 'Thai'), ('ti', 'Tigrinya'), ('tk', 'Turkmen'), ('tl', 'Tagalog'), ('tn', 'Tswana'), ('to', 'Tonga'), ('tr', 'Turkish'), ('ts', 'Tsonga'), ('tt', 'Tatar'), ('ttc', 'Tektiteko'), ('tzj', "Tz'utujil"), ('tw', 'Twi'), ('ug', 'Uyghur'), ('uk', 'Ukrainian'), ('ur', 'Urdu'), ('usp', 'Uspanteko'), ('uz', 'Uzbek'), ('vi', 'Vietnamese'), ('vo', 'Volapuk'), ('wo', 'Wolof'), ('xh', 'Xhosa'), ('xin', 'Xinka'), ('yi', 'Yiddish'), ('yo', 'Yoruba'), ('za', 'Zhuang'), ('zh', 'Chinese'), ('zu', 'Zulu'), ('nb-NO', 'Norwegian Bokmal'), ('pt-BR', 'Brazilian Portuguese'), ('es-MX', 'Mexican Spanish'), ('uk-UA', 'Ukrainian'), ('zh-CN', 'Simplified Chinese'), ('zh-TW', 'Traditional Chinese')], default='en', help_text="The language the project documentation is rendered in. Note: this affects your project's URL.", max_length=20, verbose_name='Language'),
        ),
    ]
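
As a side note on the data migration above, the six update calls could also be driven from a single mapping, which makes a reverse function straightforward. The following is only a sketch under that assumption: `LANGUAGE_RENAMES`, `_rename_languages`, and `revert_language_code` are hypothetical names that do not appear in this PR.

```python
# Hypothetical mapping from the old underscore codes to the hyphenated ones.
LANGUAGE_RENAMES = {
    'nb_NO': 'nb-NO',
    'pt_BR': 'pt-BR',
    'es_MX': 'es-MX',
    'uk_UA': 'uk-UA',
    'zh_CN': 'zh-CN',
    'zh_TW': 'zh-TW',
}


def _rename_languages(apps, mapping):
    # Use the historical model, as in the migration above.
    Project = apps.get_model('projects', 'Project')
    for old_code, new_code in mapping.items():
        Project.objects.filter(language=old_code).update(language=new_code)


def change_language_code(apps, schema_editor):
    # Forward step: underscore codes become hyphenated codes.
    _rename_languages(apps, LANGUAGE_RENAMES)


def revert_language_code(apps, schema_editor):
    # Reverse step: hyphenated codes go back to underscore codes.
    reverse_mapping = {new: old for old, new in LANGUAGE_RENAMES.items()}
    _rename_languages(apps, reverse_mapping)
```

In a migration, the pair would be passed as `migrations.RunPython(change_language_code, revert_language_code)`, so the data change can be unapplied along with the schema change.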