Ergonomic API & Data Providers #30

Open
nciric opened this issue Apr 15, 2020 · 5 comments
Labels
A-design Area: Architecture or design C-data-infra Component: provider, datagen, fallback, adapters question Unresolved questions; type unclear T-docs-tests Type: Code change outside core library

Comments

@nciric
Contributor

nciric commented Apr 15, 2020

Based on pull request #28, I would like to discuss ways we can deal with data providers and the end-client API (referred to in the document as the ergonomic API).

I feel that the average developer shouldn't care where the data comes from, but should be aware of the async nature of the request, as long as the project as a whole can set it up for them. Think about Chrome, where the Browser/Renderer processes set up data to be fetched from disk or, if missing, from a service. An ordinary developer wouldn't need to make that decision at every point of interaction with our API.

A similar approach to what @zbraniecki proposed for caching can be applied to data providers. We can have a simple DataProviderCache object that's globally available to all constructors/methods. I don't expect that a single instance of our library will have more than a handful of different providers (if that), so the cache would be fairly small.

An example of DataProviderCache initialization:

data_provider_cache = DataProviderCache()
data_provider_cache.insert('static_data', static_provider[, preference_level_0])
data_provider_cache.insert('aws_data', aws_provider[, preference_level_1])
data_provider_cache.insert('slow_data', slow_provider[, preference_level_2])
...

The preference level was added in case two providers can supply the same data set, but one of them at a potentially higher cost in speed, dollar amount, etc.

Each data provider would know which locales it can handle, and which data it can provide for each. It would also be able to tell whether it already has that data, so that a new fetch is not necessary.
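A minimal Python sketch of the setup described above, under stated assumptions: the `can_provide` and `resolve` methods, the `StaticProvider` class, and the rule that a lower preference level is tried first are all hypothetical and not part of any existing ICU4X API. It only illustrates how named providers with preference levels could be consulted.

```python
from dataclasses import dataclass, field


@dataclass
class StaticProvider:
    """Hypothetical provider backed by an in-memory dict: {(locale, key): payload}."""
    data: dict = field(default_factory=dict)

    def can_provide(self, locale, key):
        # The provider knows which (locale, key) pairs it already holds.
        return (locale, key) in self.data


class DataProviderCache:
    """Named providers with a preference level (assumption: lower is tried first)."""

    def __init__(self):
        self._providers = {}  # name -> (preference_level, provider)

    def insert(self, name, provider, preference_level=0):
        self._providers[name] = (preference_level, provider)

    def resolve(self, locale, key, names=None):
        """Return the name of the most preferred provider holding the data, or None."""
        candidates = [
            (level, name, provider)
            for name, (level, provider) in self._providers.items()
            if names is None or name in names
        ]
        for _level, name, provider in sorted(candidates, key=lambda c: c[0]):
            if provider.can_provide(locale, key):
                return name
        return None


data_provider_cache = DataProviderCache()
data_provider_cache.insert("static_data", StaticProvider({("en", "decimal"): "."}), 0)
data_provider_cache.insert("aws_data", StaticProvider({("en", "decimal"): "."}), 1)
data_provider_cache.insert("slow_data", StaticProvider({("sr", "decimal"): ","}), 2)
```

With this setup, a request for `("en", "decimal")` resolves to `static_data` because it has the lowest preference level among the providers that hold the data, while `("sr", "decimal")` falls through to `slow_data`.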

Our ergonomic API in that case would take the shape of:

Intl.NumberFormat(locale, options)

or if we want to enable developers to enforce specific data sources:

Intl.NumberFormat(locale, options, ['static_data', 'aws_data'])
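To illustrate the two call shapes, here is a rough Python sketch in which the third argument is optional. Everything in it is an assumption made for illustration: the `REGISTRY` dict stands in for the global provider object, each provider is reduced to a locale-to-separator table, and the `fraction_digits` option is invented.

```python
# Hypothetical global registry: provider name -> {locale: decimal separator}.
REGISTRY = {
    "static_data": {"en": "."},
    "aws_data": {"en": ".", "sr": ","},
}


class NumberFormat:
    def __init__(self, locale, options=None, sources=None):
        self.options = options or {}
        # Without an explicit list, consult every provider in registration order.
        names = sources if sources is not None else list(REGISTRY)
        for name in names:
            table = REGISTRY.get(name, {})
            if locale in table:
                self.source = name
                self.separator = table[locale]
                return
        raise LookupError(f"no data for locale {locale!r} in {names}")

    def format(self, value):
        digits = self.options.get("fraction_digits", 2)
        return f"{value:.{digits}f}".replace(".", self.separator)
```

Calling `NumberFormat("sr")` finds the data in `aws_data` without the caller naming a source, while `NumberFormat("sr", sources=["static_data"])` raises `LookupError` because the caller restricted the lookup to a provider that lacks the locale.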
@nciric
Contributor Author

nciric commented Apr 15, 2020

Adding @hagbard to the discussion; I know he had some thoughts about this.

@sffc sffc added the T-docs-tests Type: Code change outside core library label Apr 16, 2020
@sffc
Member

sffc commented Apr 16, 2020

> I feel that the average developer shouldn't care where the data comes from

The developer should, at some point in time, make some kind of conscious decision about where to load the data from: in-memory, data file, service, operating system, etc. The decision could be made when installing ICU4X, when writing code using ICU4X, or at some other point in the lifecycle.

Locale data is such a fundamental piece of i18n infrastructure that we would be doing a disservice by hiding it under the hood, which is largely what ICU4C and especially ICU4J tend to do.

> but should be aware of the async nature of the request, as long as the project as a whole can set it up for them.

I don't understand what you mean by "as long as the project as a whole can set it up for them".

> Think about Chrome, where the Browser/Renderer processes set up data to be fetched from disk or, if missing, from a service. An ordinary developer wouldn't need to make that decision at every point of interaction with our API.

Just to make sure we're on the same page, is it okay in your opinion if we make the developer "await" objects? That's the point of view I have been taking since I first started discussing this in tc39/ecma402#210.
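As a rough illustration of what "awaiting" an object could look like, here is a Python sketch using `asyncio`. The `AsyncProvider` class, the `create` factory, and the payload shape are hypothetical; the point is only that all data loading happens while the object is being constructed, so formatting itself stays synchronous.

```python
import asyncio


class AsyncProvider:
    """Hypothetical provider whose load may hit disk or a network service."""

    def __init__(self, table):
        self._table = table

    async def load(self, locale):
        await asyncio.sleep(0)  # stand-in for real I/O
        return self._table[locale]


class NumberFormat:
    def __init__(self, separator):
        self._separator = separator

    @classmethod
    async def create(cls, locale, provider):
        # The only await point: fetch the locale data, then build the object.
        data = await provider.load(locale)
        return cls(data["separator"])

    def format(self, value):
        # No I/O here, so callers do not need to await formatting.
        return f"{value:.2f}".replace(".", self._separator)


async def main():
    provider = AsyncProvider({"sr": {"separator": ","}})
    formatter = await NumberFormat.create("sr", provider)
    return formatter.format(1234.5)
```

Running `asyncio.run(main())` drives the sketch end to end. The trade-off is that every call site that constructs a formatter has to be async, which is the cost being asked about here.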

> A similar approach to what @zbraniecki proposed for caching can be applied to data providers. We can have a simple DataProviderCache object that's globally available to all constructors/methods. I don't expect that a single instance of our library will have more than a handful of different providers (if that), so the cache would be fairly small.

The word "cache" is misleading here, because, if I understand correctly, it appears that this object doesn't actually cache any data. A more appropriate name would be "registry".

However, I actually see no reason for a registry given a flexible data provider trait. If it's too ugly to pass a data provider into every constructor, a single default data provider can be provided in global state.
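A small Python sketch of that alternative, with assumptions stated up front: `set_default_provider` and the module-level `_default_provider` are hypothetical names, and a plain dict stands in for an object implementing the data provider trait.

```python
_default_provider = None  # hypothetical process-wide default provider


def set_default_provider(provider):
    """Install the provider used when a constructor is not given one."""
    global _default_provider
    _default_provider = provider


class NumberFormat:
    def __init__(self, locale, provider=None):
        # An explicitly passed provider wins; otherwise use the global default.
        chosen = provider if provider is not None else _default_provider
        if chosen is None:
            raise RuntimeError("no data provider given and no default set")
        self.separator = chosen[locale]


# Application start-up installs the default once.
set_default_provider({"en": ".", "sr": ","})

implicit = NumberFormat("sr")                     # uses the global default
explicit = NumberFormat("en", provider={"en": "'"})  # overrides it for one object
```

Compared with a registry of named providers, there is only one slot to configure, and any routing between sources would live inside whatever provider is installed there.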

> The preference level was added in case two providers can supply the same data set, but one of them at a potentially higher cost in speed, dollar amount, etc.

The mechanics of "preference level" would be handled by a userland forking data provider according to my proposal in data-pipeline.md.
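One way such a forking provider might look, sketched in Python. This is not taken from data-pipeline.md: the `load` method, the convention that `None` means "data not available", and both class names are assumptions made for illustration.

```python
class DictProvider:
    """Hypothetical leaf provider; returns None when it lacks the locale."""

    def __init__(self, table):
        self._table = table
        self.calls = 0  # counts lookups so the fallback order can be observed

    def load(self, locale):
        self.calls += 1
        return self._table.get(locale)


class ForkingProvider:
    """Tries child providers in the given order; the first hit wins."""

    def __init__(self, *children):
        self._children = children

    def load(self, locale):
        for child in self._children:
            payload = child.load(locale)
            if payload is not None:
                return payload
        return None


fast = DictProvider({"en": {"separator": "."}})
slow = DictProvider({"en": {"separator": "."}, "sr": {"separator": ","}})
provider = ForkingProvider(fast, slow)  # order encodes the preference
```

Because `ForkingProvider` exposes the same `load` method as its children, it can be passed anywhere a single provider is expected, and the preference order stays in application code rather than in the library.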

> or if we want to enable developers to enforce specific data sources:
>
> Intl.NumberFormat(locale, options, ['static_data', 'aws_data'])

I would rather have this decision made in a custom userland data provider.

@hagbard
Contributor

hagbard commented Apr 16, 2020 via email

@sffc sffc self-assigned this Apr 17, 2020
@sffc sffc added C-process Component: Team processes A-design Area: Architecture or design C-data-infra Component: provider, datagen, fallback, adapters and removed C-process Component: Team processes labels May 7, 2020
@sffc sffc added this to the 2020 Q2 milestone Jun 17, 2020
@sffc
Member

sffc commented Jun 23, 2020

I don't see what is immediately actionable on this issue. We currently have two Markdown files that discuss the subjects of data provider and ergonomic API. I am putting it on the backlog to revisit before v1 to make sure we end up with something consistent with what @nciric wrote in the OP.

@sffc sffc closed this as completed Jun 23, 2020
@sffc sffc removed this from the 2020 Q2 milestone Jun 23, 2020
@sffc sffc reopened this Sep 4, 2020
@sffc sffc added the question Unresolved questions; type unclear label Apr 3, 2021
@sffc
Member

sffc commented Apr 1, 2022

Here is a doc explaining async data providers:

https://docs.google.com/document/d/1haiE_XsYpyDGNpAKTZWhRwU0TU-OjDURDiZtIVjYkCk/edit

Let's make a full-stack async provider in scope for 1.1.

@sffc sffc added this to the ICU4X 1.1 milestone Apr 1, 2022
@sffc sffc removed backlog labels Apr 1, 2022
3 participants