Data provider generated with --locales all causes MissingLocale error #3631

Closed
z12315 opened this issue Jul 5, 2023 · 3 comments · Fixed by #3691
Labels
C-data-infra (Component: provider, datagen, fallback, adapters); S-tiny (Size: Less than an hour, trivial fixes); T-enhancement (Type: Nice-to-have but not required)

Comments

@z12315

z12315 commented Jul 5, 2023

Hey,

I have generated a blob using:

icu4x-datagen \
    --keys all \
    --locales all \
    --format blob \
    --out data \
    --overwrite

I'm not able to run any locale-specific operation against this blob; I generally run into "MissingLocale" or "MissingDataKey" errors. I'm not sure whether this is a bug or my mistake.

I tried to reproduce the behavior with various examples taken from the docs (replacing icu_testdata::any with a BlobDataProvider and using the "de", "de_de" and "en" locales):

#[cfg(test)]
mod test {
    use std::cmp::Ordering;

    use icu::collator::{CollatorOptions, Collator, Strength};
    use icu::locid::{Locale, locale};
    use icu_provider::hello_world::HelloWorldFormatter;
    use icu_provider_blob::BlobDataProvider;

    const LOCALE: Locale = locale!("de_de"); // let's try some other language
    const DATA: &[u8] = std::include_bytes!("data");

    #[test]
    fn icu_compare_german() {
        fn compare(left: &str, right: &str) -> std::cmp::Ordering {
            let provider = BlobDataProvider::try_new_from_static_blob(DATA).unwrap();
        
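            // Primary strength ignores accent differences, so "ä" and "a" compare as equal.
            let mut options = CollatorOptions::new();
            options.strength = Some(Strength::Primary);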
            let collator: Collator = Collator::try_new_with_buffer_provider(
                &provider,
                &LOCALE.into(),
                options,
            )
            .unwrap();
        
            collator.compare(left, right)
        }

        assert_eq!(compare("ä", "a"), Ordering::Equal);
    }

    #[test]
    fn icu_collator_example() {
        let provider = BlobDataProvider::try_new_from_static_blob(DATA).unwrap();

        let locale_es: Locale = locale!("es-u-co-trad");
        let mut options = CollatorOptions::new();
        options.strength = Some(Strength::Primary);
        let collator_es: Collator = Collator::try_new_with_buffer_provider(
            &provider,
            &locale_es.into(),
            options,
        )
        .unwrap();
        
        assert_eq!(collator_es.compare("pollo", "polvo"), Ordering::Greater);
    }

    #[test]
    fn icu_provider_blob_example() {
        let provider = BlobDataProvider::try_new_from_static_blob(DATA).unwrap();

        let formatter = HelloWorldFormatter::try_new_with_buffer_provider(
            &provider,
            &locale!("la").into(),
        )
        .expect("locale exists");

        assert_eq!(formatter.format().to_string(), "Ave, munde");
    }
}

Full project to reproduce: https://github.com/z12315/icu4x_blob_provider_locale_error/tree/main

Thanks!

@sffc
Member

sffc commented Jul 5, 2023

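It's --locales full, not --locales all. Note that "all" is a valid language code: it refers to Allar, a language spoken in the Indian state of Kerala. The blob therefore contains data for that one language only.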
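To illustrate why "all" is accepted silently as a locale, here is a minimal standalone sketch that checks only the shape of a Unicode language subtag (2-3 or 5-8 ASCII letters). The function name is hypothetical and this is not how icu4x-datagen parses its arguments; it only shows that "all" looks like a language code while "full", at four letters, does not.

```rust
/// Hypothetical helper: returns true if `s` has the shape of a Unicode
/// language subtag, i.e. 2-3 or 5-8 ASCII letters. This checks shape only
/// and does not consult any language registry.
fn is_wellformed_language_subtag(s: &str) -> bool {
    let len_ok = matches!(s.len(), 2..=3 | 5..=8);
    len_ok && s.bytes().all(|b| b.is_ascii_alphabetic())
}

fn main() {
    // "all" has the shape of a language code, so it can be read as a locale.
    assert!(is_wellformed_language_subtag("all"));
    // "full" has four letters, which is the shape of a script subtag,
    // so it cannot be mistaken for a language.
    assert!(!is_wellformed_language_subtag("full"));
    // Ordinary language codes pass as well.
    assert!(is_wellformed_language_subtag("de"));
}
```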

@robertbastian
Member

We should add a warning for --locales all.
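A rough sketch of what such a warning could look like, assuming the values passed to --locales are available as a slice of strings. The function name and the message text are hypothetical; this is not the change that was eventually merged, only an illustration of the idea.

```rust
/// Hypothetical helper: returns a warning message when the `--locales`
/// values contain "all", which is the language code for Allar rather
/// than a keyword meaning "every locale".
fn locales_warning(values: &[&str]) -> Option<String> {
    if values.iter().any(|v| v.eq_ignore_ascii_case("all")) {
        Some(String::from(
            "`--locales all` selects the Allar language; did you mean `--locales full`?",
        ))
    } else {
        None
    }
}

fn main() {
    // The suspicious value triggers the warning.
    if let Some(msg) = locales_warning(&["all"]) {
        eprintln!("warning: {msg}");
    }
    // Other values pass through silently.
    assert!(locales_warning(&["full"]).is_none());
    assert!(locales_warning(&["de", "en"]).is_none());
}
```

Returning the message instead of printing it directly keeps the check easy to test and leaves it to the caller to decide whether to warn or abort.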

sffc changed the title from "Using BlobDataProvider with icu4x_datagen data causes MissingLocale error" to "Data provider generated with --locales all causes MissingLocale error" on Jul 5, 2023
sffc changed the title again on Jul 5, 2023 (same wording)
sffc added the labels C-data-infra, S-tiny and T-enhancement on Jul 5, 2023
@z12315
Author

z12315 commented Jul 5, 2023

Embarrassing. Thanks a lot! :-)
