Adding source data versions to baked crates #4399
provider/core/src/datagen/mod.rs
Outdated
impl<T> ExportableProvider for T where
    T: IterableDynamicDataProvider<ExportMarker> + DynamicDataProvider<AnyMarker> + Sync
{
    /// Returns the source versions of the data.
This breaks semver but I don't know how else I would get this data from the source provider. I really don't want to define a data key for this...
Adding the function to the trait with a default impl doesn't break semver, but removing the blanket impl does. Can you use runtime specialization in the blanket impl to sniff for the known concrete impl that has a source_versions function? Well, I guess that requires making T: 'static, which is also a semver break. I lean toward just deleting the blanket impl, at least to get the PR building, and then we can escalate to the team.
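For readers unfamiliar with the "runtime specialization" idea mentioned above, here is a minimal sketch of what it could look like. All names (Versions, Exportable, KnownProvider, OtherProvider) are hypothetical stand-ins rather than the actual icu_provider types, and the PR did not necessarily take this route. The sketch also shows why the extra 'static bound appears: downcasting goes through std::any::Any.

```rust
use std::any::Any;

/// Hypothetical stand-in for the PR's source-version metadata.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Versions {
    pub cldr: Option<&'static str>,
}

/// Hypothetical stand-in for `ExportableProvider`.
pub trait Exportable {
    /// A default body lets existing manual implementors keep compiling.
    fn source_versions(&self) -> Versions {
        Versions::default()
    }
}

/// The one concrete provider that knows its source versions (assumed).
pub struct KnownProvider {
    pub cldr_tag: &'static str,
}

/// Any other provider that is covered by the blanket impl.
pub struct OtherProvider;

// `T: Any` implies `T: 'static`; that added bound is the semver concern.
impl<T: Any> Exportable for T {
    fn source_versions(&self) -> Versions {
        // "Sniff" for the known concrete type at runtime.
        match (self as &dyn Any).downcast_ref::<KnownProvider>() {
            Some(known) => Versions { cldr: Some(known.cldr_tag) },
            None => Versions::default(),
        }
    }
}
```

With this shape, KnownProvider reports its tag while every other type falls back to the default, at the cost of rejecting providers that borrow non-'static data.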
provider/datagen/src/provider.rs
Outdated
tag.strip_prefix("release-")
    .unwrap_or(tag)
    .replace('-', ".")
    .leak(),
Suggestion: often we use pre-release tags. Maybe just use the whole tag name?
The GH tags don't match the version in the ICU repo, so I have to process them somehow. release-75-rc will become 75.rc, which I guess is fine. I do want release-75-1 to become 75.1, because that's the version that we would read from the files.
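As a quick illustration of the mapping described above, the following sketch applies the same strip_prefix / replace chain as the snippet under review, minus the leak. The function name version_from_tag is hypothetical.

```rust
/// Hypothetical helper mirroring the tag handling in the reviewed snippet:
/// drop a leading "release-" if present, then turn every '-' into '.'.
pub fn version_from_tag(tag: &str) -> String {
    tag.strip_prefix("release-")
        .unwrap_or(tag)
        .replace('-', ".")
}
```

Under this rule release-75-1 maps to 75.1 and release-75-rc maps to 75.rc. A tag without the prefix, such as icu4x/2023-12-01/73.1, still has its dashes rewritten and comes out as icu4x/2023.12.01/73.1.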
Why not just read it from the files?
I'm thinking about the tags like icu4x/2023-12-01/73.1, which I'd like to not need to parse. I'm okayish with bubbling them up to users, though I agree having an actual version number seems more useful.
Why not just read it from the files?
Because then 10 different tags will be 74.1.
How about we save the tag string in SourceVersions, and then if we want we can add an instance method to SourceVersions that parses the ICU version out of the tag.
How about you let me parse release-xx-yy to xx.yy?
It seems like a more consistent mental model if SourceVersions::Public always contains the tag string.
If you want to parse the release version into SourceVersions::Public, you could make a new variant SourceVersions::Tagged for arbitrary tags that aren't release tags (including hashes, etc)?
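One way the two suggestions in this thread could fit together is sketched below: keep the tag string verbatim in the enum, and derive the dotted version on demand through an instance method. This is a hypothetical shape, not the design that was merged. The variant names follow the discussion, while from_tag and release_version are invented for illustration.

```rust
/// Hypothetical sketch: each variant stores the tag exactly as given.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SourceVersion<'data> {
    /// A release tag such as "release-75-1", stored unparsed.
    Public(&'data str),
    /// Any other tag (dated tags, hashes, and so on).
    Tagged(&'data str),
}

impl<'data> SourceVersion<'data> {
    /// Classify a tag by its prefix (assumed convention: "release-").
    pub fn from_tag(tag: &'data str) -> Self {
        if tag.starts_with("release-") {
            SourceVersion::Public(tag)
        } else {
            SourceVersion::Tagged(tag)
        }
    }

    /// Parse "release-xx-yy" into "xx.yy"; non-release tags yield `None`.
    pub fn release_version(&self) -> Option<String> {
        match self {
            SourceVersion::Public(tag) => tag
                .strip_prefix("release-")
                .map(|rest| rest.replace('-', ".")),
            SourceVersion::Tagged(_) => None,
        }
    }
}
```

Because the parsed string is produced only when requested, nothing has to be leaked to obtain a borrowed str, and tags like icu4x/2023-12-01/73.1 never go through the dash-to-dot rewrite.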
provider/core/src/versions.rs
Outdated
/// A concrete version
pub enum SourceVersion<'data> {
    /// A released version
    Public(#[cfg_attr(feature = "serde", serde(borrow))] &'data str),
Suggestion: use a Cow so we don't leak strings.
Can't have cows in consts because of dropping stuff.
provider/core/src/versions.rs
Outdated
pub struct SourceVersions<'data> {
    #[cfg_attr(feature = "serde", serde(borrow))]
    /// The version of CLDR source data, if used during datagen.
    pub cldr: Option<SourceVersion<'data>>,
Observation/Issue: icu_provider has no concept of CLDR. I'd like to keep it agnostic if possible.
Personally, I think it's important to have this, but it's not a strong opinion.
Oh and thanks for this PR!
provider/core/src/versions.rs
Outdated
}

#[cfg(feature = "datagen")]
impl databake::Bake for SourceVersions<'_> {
Thought: ideally it would be very cool if, e.g., the SourceVersions for properties did not mention a CLDR version (since it's irrelevant),
but that seems to be more hassle than it's worth to track.
Consensus:
[metadata.sources]
# some free structure chosen by datagen
cldr = { tagged = "44.0.0" }
icuexport = { custom = "74.0" }
d9ca255 to f7bf8df
provider/baked/_template_/Cargo.toml
Outdated
@@ -15,3 +15,8 @@ homepage.workspace = true
 include.workspace = true
 repository.workspace = true
 rust-version.workspace = true
+
+[metadata.sources]
Nit: I think it should be [package.metadata.sources]
https://doc.rust-lang.org/cargo/reference/manifest.html#the-metadata-table
- .replace("_version_", version),
+ .replace("_version_", version)
+ .replace("_cldr_tag_", DatagenProvider::LATEST_TESTED_CLDR_TAG)
+ .replace(
+     "_icuexport_tag_",
+     DatagenProvider::LATEST_TESTED_ICUEXPORT_TAG,
+ )
+ .replace(
+     "_segmenter_lstm_tag_",
+     DatagenProvider::LATEST_TESTED_SEGMENTER_LSTM_TAG,
+ ),
Praise: This is, like... a super clean and simple way to do things. Good suggestion 👍
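To make the placeholder approach concrete, here is a self-contained sketch of filling a manifest template with chained replace calls. The template text, the constant values, and the render_manifest name are all assumptions for illustration. The real template lives in provider/baked/_template_/Cargo.toml and the real constants on DatagenProvider. The table name follows the [package.metadata.sources] nit above.

```rust
// Placeholder values only; not the actual tags tested by datagen.
const CLDR_TAG: &str = "44.0.0";
const ICUEXPORT_TAG: &str = "release-74-1";
const SEGMENTER_LSTM_TAG: &str = "v0.1.0";

// Hypothetical, trimmed-down manifest template with `_name_` placeholders.
const TEMPLATE: &str = r#"[package]
name = "baked_data"
version = "_version_"

[package.metadata.sources]
cldr = { tagged = "_cldr_tag_" }
icuexport = { tagged = "_icuexport_tag_" }
segmenter_lstm = { tagged = "_segmenter_lstm_tag_" }
"#;

/// Substitute every placeholder in the template with its concrete value.
pub fn render_manifest(version: &str) -> String {
    TEMPLATE
        .replace("_version_", version)
        .replace("_cldr_tag_", CLDR_TAG)
        .replace("_icuexport_tag_", ICUEXPORT_TAG)
        .replace("_segmenter_lstm_tag_", SEGMENTER_LSTM_TAG)
}
```

Cargo ignores the package.metadata table, so tools can store free-form data there without triggering unused-key warnings.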
#4342