# Setting the locale in a Google Colab notebook

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/enabling-languages/python-i18n/blob/main/colab/locale_module_colab.ipynb)

## Overview

With Google Colab, it is necessary to distinguish between the locale used by the Colab user interface and the locale used by Python when executing code. The Colab user interface follows the language settings of your web browser. The Python environment uses the locale of the initialised runtime environment in which the notebook is running.

## Configuring Google Colab

When a runtime is initialised there is limited locale support. The available locales are:

* `C`
* `C.UTF-8`
* `en_US.utf8`
* `POSIX`

The runtime environment will be using `en_US.utf8`. If you require your notebook to use a different locale, it is necessary to use the `locale` module at the beginning of the notebook. The locale should not be subsequently changed. If you require multiple locales during code execution, or need code to be flexible about the locale being used, it is best to use `PyICU`.
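As a rough illustration of the PyICU route, the sketch below sorts the same words under two locales without changing the process-wide locale. It assumes PyICU has been installed separately (for example with `pip install PyICU`). The helper name `sort_for_locale` and the sample words are hypothetical.

```python
# Sketch: per-call locale handling with PyICU instead of locale.setlocale().
# Assumption: PyICU is installed; without it the helper simply returns None.
try:
    import icu
except ImportError:
    icu = None


def sort_for_locale(words, locale_name):
    """Return words sorted by the collation rules of locale_name (None without PyICU)."""
    if icu is None:
        return None
    # Each collator carries its own locale, so the global locale is untouched.
    collator = icu.Collator.createInstance(icu.Locale(locale_name))
    return sorted(words, key=collator.getSortKey)


words = ["zebra", "Äpfel", "apple", "Zürich"]
for name in ("de_DE", "sv_SE"):
    print(name, sort_for_locale(words, name))
```

Because each collator is created for a specific locale, several locales can be used side by side in the same notebook.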

To use a different locale in a notebook, it is necessary to:

1. install the necessary language packs, and
2. restart the runtime environment.

The notebook contains example code that tries to set the required locale. If the locale is unavailable, the code installs the language pack required for this notebook and then restarts the runtime environment.

It will be necessary to manually rerun the notebook.
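Before installing anything, it can help to check whether a locale is already usable. The following is a minimal sketch, not part of the original notebook. The helper name `locale_is_available` is hypothetical, and it relies only on the standard `locale` module.

```python
import locale


def locale_is_available(name, category=locale.LC_ALL):
    """Return True if `name` can be set, leaving the current locale unchanged."""
    # Calling setlocale() with only a category queries the current setting.
    saved = locale.setlocale(category)
    try:
        locale.setlocale(category, name)
    except locale.Error:
        return False
    finally:
        # Restore whatever was active before the probe.
        locale.setlocale(category, saved)
    return True


for candidate in ("C", "en_AU.UTF-8", "am_ET.UTF-8"):
    print(candidate, locale_is_available(candidate))
```

On a freshly initialised Colab runtime, the second and third candidates would be expected to report `False` until the corresponding locales are installed and the runtime is restarted.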

In [1]:
import platform
platform.platform()

'Linux-5.4.144+-x86_64-with-Ubuntu-18.04-bionic'

When this notebook was authored, the Colab runtime environment was based on Ubuntu 18.04 LTS (Bionic).

Refer to the [list of language packs](https://packages.ubuntu.com/bionic/translations/) available in the Ubuntu package archive.

In [2]:
# List all locales currently available in the runtime
!locale -a

am_ET
am_ET.utf8
C
C.UTF-8
en_AU.utf8
en_US.utf8
POSIX


Once a locale is installed, it is necessary to restart the runtime environment and rerun the notebook code cells. After the restart, `locale -a` reflects the changes. When the command first runs, the list of locales is `C`, `C.UTF-8`, `en_US.utf8`, and `POSIX`. When the notebook is first rerun, `en_AU.utf8` is added. On the final rerun, `am_ET` and `am_ET.utf8` are added.
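The same check can be made from Python rather than with a `!` shell command. The sketch below is an illustration only. The helper name `installed_locales` and the `required` set are hypothetical, and it assumes the `locale` command is on the path (it returns an empty list otherwise).

```python
import shutil
import subprocess


def installed_locales():
    """Return the names reported by `locale -a`, or [] if the command is missing."""
    if shutil.which("locale") is None:
        return []
    result = subprocess.run(
        ["locale", "-a"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # some systems list names in legacy encodings
        check=False,
    )
    return result.stdout.split()


# Locales this notebook expects, spelled the way `locale -a` reports them.
required = {"en_AU.utf8", "am_ET.utf8"}
missing = sorted(required - set(installed_locales()))
print("Locales still to be installed:", missing)
```

Rerunning this after each restart should show the `missing` list shrinking as locales are added.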

The file `/usr/share/i18n/SUPPORTED` lists all the supported locales that you can use with Google Colab.

In [9]:
import re

# Generate a list of all UTF-8 locales supported in current Google Colab runtime
supported = "/usr/share/i18n/SUPPORTED"
supported_locales = []
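with open(supported) as fp:
    for line in fp:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, charset = fields[0], fields[1]
        if "UTF-8" in name:
            # Name already carries the encoding, e.g. "aa_DJ.UTF-8"
            supported_locales.append(name)
        elif charset == "UTF-8":
            # Append the encoding to names such as "aa_ER"
            supported_locales.append(f"{name}.{charset}")

# All supported locales in current Google Colab runtime
print(supported_locales)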

['aa_DJ.UTF-8', 'aa_ER.UTF-8', 'aa_ER@saaho.UTF-8', 'aa_ET.UTF-8', 'af_ZA.UTF-8', 'agr_PE.UTF-8', 'ak_GH.UTF-8', 'am_ET.UTF-8', 'an_ES.UTF-8', 'anp_IN.UTF-8', 'ar_AE.UTF-8', 'ar_BH.UTF-8', 'ar_DZ.UTF-8', 'ar_EG.UTF-8', 'ar_IN.UTF-8', 'ar_IQ.UTF-8', 'ar_JO.UTF-8', 'ar_KW.UTF-8', 'ar_LB.UTF-8', 'ar_LY.UTF-8', 'ar_MA.UTF-8', 'ar_OM.UTF-8', 'ar_QA.UTF-8', 'ar_SA.UTF-8', 'ar_SD.UTF-8', 'ar_SS.UTF-8', 'ar_SY.UTF-8', 'ar_TN.UTF-8', 'ar_YE.UTF-8', 'ayc_PE.UTF-8', 'az_AZ.UTF-8', 'az_IR.UTF-8', 'as_IN.UTF-8', 'ast_ES.UTF-8', 'be_BY.UTF-8', 'be_BY@latin.UTF-8', 'bem_ZM.UTF-8', 'ber_DZ.UTF-8', 'ber_MA.UTF-8', 'bg_BG.UTF-8', 'bhb_IN.UTF-8', 'bho_IN.UTF-8', 'bho_NP.UTF-8', 'bi_VU.UTF-8', 'bn_BD.UTF-8', 'bn_IN.UTF-8', 'bo_CN.UTF-8', 'bo_IN.UTF-8', 'br_FR.UTF-8', 'brx_IN.UTF-8', 'bs_BA.UTF-8', 'byn_ER.UTF-8', 'ca_AD.UTF-8', 'ca_ES.UTF-8', 'ca_ES@valencia.UTF-8', 'ca_FR.UTF-8', 'ca_IT.UTF-8', 'ce_RU.UTF-8', 'ckb_IQ.UTF-8', 'chr_US.UTF-8', 'cmn_TW.UTF-8', 'crh_UA.UTF-8', 'cs_CZ.UTF-8', 'csb_PL.UTF-8', '

In [4]:
# Return list of UTF-8 locales that support specified language
def matching_locales(lang):
  # Escape the language code so that it is matched literally
  r = re.compile(r'^' + re.escape(lang) + r'_.+')
  return list(filter(r.match, supported_locales))

#lang = input("Language code: ")
lang = "fr"
print(matching_locales(lang))

['fr_BE.UTF-8', 'fr_CA.UTF-8', 'fr_CH.UTF-8', 'fr_FR.UTF-8', 'fr_LU.UTF-8']


To identify the default system locale used by Python, use [locale.getdefaultlocale()](https://docs.python.org/3/library/locale.html#locale.getdefaultlocale).

In [5]:
import locale
default_locale = locale.getdefaultlocale()
default_locale

('en_US', 'UTF-8')
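Note that `locale.getdefaultlocale()` has been deprecated since Python 3.11. As a hedged alternative, the sketch below gathers similar information from calls that are not deprecated. The helper name `runtime_locale_summary` is hypothetical.

```python
import locale
import os


def runtime_locale_summary():
    """Collect the locale-related settings visible to the current Python process."""
    return {
        # Passing only a category queries the current setting without changing it.
        "LC_CTYPE": locale.setlocale(locale.LC_CTYPE),
        "preferred_encoding": locale.getpreferredencoding(False),
        # Environment variables that influence the default locale, if set.
        "environment": {
            name: os.environ.get(name) for name in ("LC_ALL", "LC_CTYPE", "LANG")
        },
    }


for key, value in runtime_locale_summary().items():
    print(f"{key}: {value}")
```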

There are two approaches to installing additional locales on Google Colab:

1. Install the appropriate language pack
2. Use `locale-gen` and `update-locale`

The second approach is probably best for Google Colab. The first installs system localisation files as well as the locales, while the second installs only the locales required.

## Using language packs

The following code cell, when run on Google Colab, sets the locale to English (Australia). If the locale is unavailable, it installs the language pack required by the notebook and then restarts the Colab runtime environment. After the runtime environment restarts, it is necessary to rerun the code in the notebook.

In [6]:
try:
  import google.colab
  IN_COLAB = True
except ImportError:
  IN_COLAB = False

if IN_COLAB:
  try:
    locale.setlocale(locale.LC_ALL, "en_AU.UTF-8")
  except locale.Error:
    print("Installing the required locale. The runtime environment will restart after the install; please rerun all code.")
    !sudo apt-get install language-pack-en
    import os
    # Kill the current process so that Colab restarts the runtime
    os.kill(os.getpid(), 9)
else:
  try:
    locale.setlocale(locale.LC_ALL, "en_AU.UTF-8")
  except locale.Error:
    # Outside Colab, fall back to the user's default locale
    locale.setlocale(locale.LC_ALL, "")

In [7]:
# Current locale
locale.getlocale()

('en_AU', 'UTF-8')
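The cell above hard-codes `language-pack-en`. For other locales, the package name can usually be guessed from the language code. The sketch below rests on the assumption that Ubuntu language packs follow the `language-pack-<language>` pattern. Some do not (Chinese, for example, uses names such as `language-pack-zh-hans`), so the result should be checked against the package list. The helper name `language_pack_for` is hypothetical.

```python
def language_pack_for(locale_name):
    """Guess the Ubuntu language pack for a locale such as 'en_AU.UTF-8'."""
    # Drop the encoding ('.UTF-8') and any modifier ('@latin'), then the territory.
    base = locale_name.split(".")[0].split("@")[0]
    language = base.split("_")[0]
    return f"language-pack-{language}"


print(language_pack_for("en_AU.UTF-8"))  # language-pack-en
print(language_pack_for("am_ET.UTF-8"))  # language-pack-am
```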

## Using `locale-gen` and `update-locale`

Alternatively, it is possible to use `locale-gen` and `update-locale` to install the necessary locale.

In [8]:
#loc = input('Enter required locale: ')
loc = "am_ET.UTF-8"

# If the locale is supported, set it, or install it and restart the runtime.
# If the locale is unsupported, fall back to the default locale.
if loc in supported_locales:
  try:
    locale.setlocale(locale.LC_ALL, loc)
  except locale.Error:
    print("After the missing locale is installed, the runtime environment will restart. Please rerun all code.")
    !sudo locale-gen {loc}
    !sudo update-locale
    import os
    os.kill(os.getpid(), 9)
else:
  locale.setlocale(locale.LC_ALL, '')

print("Current locale: ", locale.getlocale())

Current locale:  ('am_ET', 'UTF-8')


This snippet also shows how to pass a Python variable to a system command in the runtime environment by placing the variable name in braces, e.g. `{loc}`.
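The brace syntax is an IPython feature, so it only works inside a notebook or IPython shell. In a plain Python script, a comparable effect can be had with `subprocess`. In the minimal sketch below, `echo` stands in for the real command so that nothing is installed.

```python
import subprocess

loc = "am_ET.UTF-8"

# `echo` is a harmless stand-in; in practice the list might start with
# "sudo", "locale-gen" instead. Passing a list avoids shell quoting issues.
command = ["echo", "locale-gen", loc]
result = subprocess.run(command, capture_output=True, text=True, check=True)
print(result.stdout.strip())  # locale-gen am_ET.UTF-8
```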

## Notes

_When adding locales, you are adding system resources. To make them available to Python, it is necessary to restart the runtime environment and rerun the notebook code._

_The locale should be set near the beginning of the notebook, and should not be changed._
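
To illustrate why the locale is best fixed once, the sketch below shows two operations whose output depends on the active locale. It assumes `en_AU.UTF-8` as the target and falls back to the always-available `C` locale if that is not installed. The variable names are hypothetical.

```python
import locale
import time

# Set the locale once; fall back to "C" if the target is not installed.
try:
    locale.setlocale(locale.LC_ALL, "en_AU.UTF-8")
except locale.Error:
    locale.setlocale(locale.LC_ALL, "C")

# Number grouping and day/month names both follow the active locale.
number = locale.format_string("%.2f", 1234567.891, grouping=True)
date = time.strftime("%A %d %B %Y", time.gmtime(0))
print(number)
print(date)
```

Changing the locale partway through a notebook would make cells like this produce different output depending on execution order.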

<div style="text-align: center; text-style: italic;">© 2021-2022 <a href="https://enabling-languages.github.io/">Enabling Languages</a>. <br>
Released under the <a href="https://github.com/enabling-languages/python-i18n/blob/main/LICENSE">MIT license</a>.</div>