#47: Adding support for localized faker provider #48

Merged
hkage merged 13 commits into development from feature/localized-providers on Nov 29, 2022


hkage (Contributor) commented Nov 25, 2022

This PR adds support for Faker's localized providers. This enables generator methods whose output is localized for one or more locales, as well as generator methods that exist only in localized providers, such as VAT IDs.

Because the Faker instance must be initialized with one or more locales, and initializing it per data row would cause a massive performance problem, the locale has to be set for the whole Faker context; that is, it applies to the entire anonymization process. It is therefore added as a separate option in the YAML schema, like this:

tables:
  - address:
      fields:
        - first_name:
            provider:
              name: faker.first_name
        - vat_id:
            provider:
              name: faker.vat_id
options:
  faker:
    locales:
      - de_DE

@BuddhaOhneHals

Locales on field level can be supported without initializing the Faker instance per data row. The Faker documentation suggests that you can access each locale you previously provided like this: fake['de_DE'].name().

So it would be possible to support something like this:

tables:
  - address:
      fields:
        - first_name:
            provider:
              name: faker.en_US.first_name
        - vat_id:
            provider:
              name: faker.de_DE.vat_id

options:
  faker:
    locales:
      - de_DE
      - en_US

or

tables:
  - address:
      fields:
        - first_name:
            provider:
              name: faker.first_name
              locale: en_US
        - vat_id:
            provider:
              name: faker.vat_id

options:
  faker:
    locales:
      - de_DE
      - en_US
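
Both variants above could map onto the `fake['de_DE']` lookup roughly as follows. This is only a sketch of the idea: `split_provider_name` and `call_provider` are hypothetical helpers, and `fake` is assumed to be a Faker instance created once with all configured locales.

```python
def split_provider_name(name, known_locales):
    """Split 'faker.de_DE.vat_id' into ('de_DE', 'vat_id').

    Returns (None, method) when the name carries no locale segment,
    as in 'faker.vat_id'.
    """
    parts = name.split(".")
    if parts[0] != "faker":
        raise ValueError(f"not a faker provider: {name!r}")
    if len(parts) == 3 and parts[1] in known_locales:
        return parts[1], parts[2]
    if len(parts) == 2:
        return None, parts[1]
    raise ValueError(f"cannot parse provider name: {name!r}")


def call_provider(fake, name, known_locales, locale=None):
    """Call a generator method, honoring either variant of the locale syntax."""
    embedded_locale, method = split_provider_name(name, known_locales)
    # Variant 1 embeds the locale in the name; variant 2 passes it separately.
    chosen = embedded_locale or locale
    # fake[chosen] selects the per-locale generator; plain fake uses all locales.
    generator = fake[chosen] if chosen else fake
    return getattr(generator, method)()
```

The second variant needs no name parsing at all, which may be an argument in its favor.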

What do you think?

@BuddhaOhneHals

I added support for defining locales on field level and introduced a default_locale option.

Full example:

    tables:
      - user:
          primary_key: id
          fields:
            - name:
                provider:
                  # No locale entry at all, use configured default_locale "de_DE"
                  name: fake.name
            - city:
                provider:
                  # Use "en_US"
                  name: fake.city
                  locale: en_US
            - street:
                provider:
                  # Use "cs_CZ"
                  name: fake.street_address
                  locale: cs_CZ
            - zipcode:
                provider:
                  # Use empty locale to ignore default_locale and to randomly select locale
                  name: fake.postcode
                  locale:

    options:
      faker:
        locales:
          - de_DE
          - en_US
          - cs_CZ
        default_locale: de_DE
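
The three cases in the example (no `locale` key, an explicit locale, and an empty `locale`) could be distinguished along these lines. This is a sketch under assumptions, not the merged implementation: `select_generator` is a hypothetical helper, `fake` is assumed to be a Faker instance built with all configured locales, and an empty `locale:` entry is assumed to be loaded from YAML as `None`.

```python
# Sentinel to tell "key absent" apart from "key present but empty (None)".
_MISSING = object()


def select_generator(fake, provider_config, default_locale=None):
    """Pick the generator for one field (hypothetical helper)."""
    locale = provider_config.get("locale", _MISSING)
    if locale is _MISSING:
        # No locale entry at all: fall back to the configured default_locale.
        locale = default_locale
    if locale is None:
        # Empty locale (or no default): use the multi-locale instance itself,
        # which picks one of its locales for each call.
        return fake
    # Explicit locale: use the generator for exactly that locale.
    return fake[locale]
```

With the configuration above, `select_generator(fake, {"name": "fake.name"}, "de_DE")` would return the `de_DE` generator, while `{"name": "fake.postcode", "locale": None}` would return the multi-locale instance.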

@BuddhaOhneHals BuddhaOhneHals linked an issue Nov 29, 2022 that may be closed by this pull request
@BuddhaOhneHals BuddhaOhneHals marked this pull request as ready for review November 29, 2022 11:24
hkage (Contributor, Author) commented Nov 29, 2022

LGTM 👍

@hkage hkage merged commit 5226982 into development Nov 29, 2022
@hkage hkage deleted the feature/localized-providers branch November 29, 2022 12:35
Development

Successfully merging this pull request may close these issues.

How to use faker’s localized providers?
2 participants