Extract language definitions into data file #699
Conversation
This changes the data format from JSON to YAML to be easier to use and embed multiline strings. It then uses the YAML to fill in a template Rust file with relevant values.
Was part of temporary changes when generating JSON export.
It is no longer needed, as test function names are now generated with templates instead of macros.
There will likely be more changes needed, but I think this is at least ready for an initial review.
These types of things always make me wonder if the data files should be discoverable by Linguist, since in this case it's used in code generation.
I think we're going to need to add some integration tests at some point. This PR shouldn't change any behavior at all, but right now we basically ensure that by checking if the output looks good. I automated the bulk of this with a quick, lazy, buggy script, so I don't have 100% confidence that there weren't some unintended changes to logos. It definitely mangled the Haskell logo, which I had to fix.
languages.yaml
Outdated
- [255, 255, 255]
- [0, 24, 201]
- [12, 10, 124]
chip: [2, 248, 140]
chip: [2, 248, 140]
chip: Rgb[2, 248, 140]
We should maybe keep the Rgb prefix everywhere to make it more self-explanatory and obvious that we're talking about colors.
Is that valid YAML? AFAIK that would deserialize into a string, which would require some additional parsing in addition to deserialization in build.rs.
It could perhaps be an object, like
chip:
r: 2
g: 248
b: 140
but I was trying to reduce the length of the config while maintaining human readability. Or maybe change the key to chip-rgb.
Yes, or chip-color. Too bad owo_colors doesn't support hex colors 🤔
I could register a hex-to-rgb filter in the build script.
What do you think about this?
ansi:
- red
- green
- blue
rgb: # maybe another name, "hex" or "true" or "fullcolor"?
- '#FF0000' # mainly just for consistency with chip, so we aren't mixing RGB and hex
- '#00FF00'
- '#0000FF'
chip: '#02F88C' # sadly # is a comment, so it needs quotes -_-
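To illustrate, here's a minimal sketch of the hex-to-rgb conversion such a build-script filter would perform. The function name and shape are hypothetical, not taken from the onefetch codebase:

```rust
// Hypothetical helper showing the hex-to-rgb parsing a custom Tera
// filter could do in build.rs; names are illustrative, not onefetch's.
fn hex_to_rgb(hex: &str) -> Option<(u8, u8, u8)> {
    // Require the leading '#', matching the YAML format above.
    let hex = hex.strip_prefix('#')?;
    if hex.len() != 6 {
        return None;
    }
    // Parse each two-character pair as a base-16 byte.
    let r = u8::from_str_radix(&hex[0..2], 16).ok()?;
    let g = u8::from_str_radix(&hex[2..4], 16).ok()?;
    let b = u8::from_str_radix(&hex[4..6], 16).ok()?;
    Some((r, g, b))
}

fn main() {
    // The chip color from the YAML example above.
    assert_eq!(hex_to_rgb("#02F88C"), Some((2, 248, 140)));
    // Strings without the '#' prefix are rejected.
    assert_eq!(hex_to_rgb("02F88C"), None);
}
```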
Great idea! We're getting closer to the GitHub Linguist format: https://github.com/github/linguist/blob/master/lib/linguist/languages.yml
hex sounds good to me.
templates/language.rs
Outdated
{% for language, attrs in languages -%}
{% if attrs.colors.rgb %}
{% set ansi_length = attrs.colors.ansi | length %}
{% set rgb_length = attrs.colors.rgb | length %}
{% if ansi_length != rgb_length %}
compile_error!("{{ language }}: ansi and rgb colors must be the same length");
{% endif %}
{% endif %}
{% endfor %}

{% set max_width = 40 %}
{# NOTE Permitting trailing newline #}
{% set max_height = 26 %}

{% for language, attrs in languages -%}
{% set lines = attrs.ascii | split(pat="\n") %}
{% set height = lines | length %}
{% if height > max_height %}
compile_error!("{{ language }}: ascii art must have {{ max_height - 1 }} or less lines, has {{ height }}");
{% endif %}

{% for line in lines %}
{% set cleaned_line = line | strip_color_indices %}
{% set width = cleaned_line | length %}
{% if width > max_width %}
compile_error!("{{ language }}: ascii art line {{ loop.index }} must be {{ max_width }} or less characters wide");
{% endif %}
{% endfor %}
{% endfor %}
I'm not sure why, but if you look at the language.rs generated by Tera in the /target/debug/onefetch-*/out folder, this part of the template results in thousands of empty lines.
Besides, IMO these should be unit tests, not compilation errors. We should maybe find a way to put them back in the original language.rs file.
Yeah, the empty lines are from the if statements never being entered. E.g.
{# 3 empty lines #}
{% for n in [1,2,3] %}
{% if n > 4 %}println!("Hello!"){% endif %}
{% endfor %}
IMO we don't need to worry too much about how pretty the generated code is: generated code isn't really intended to be human-readable.
For compile errors vs. tests, the benefit of compile errors is that you can't even run onefetch if the values are invalid. Additionally, IMO it's preferable to assert the validity of values at compile time instead of at runtime (in this case, the test runtime). IMO tests should really be for behavior, not validating internal values. As a reductionist example, it's kind of like doing this:
/// A number between 0 and 255
const SMALL_NUMBER: usize = make_number!();
#[test]
fn number_is_small() {
assert!(SMALL_NUMBER < 256);
}
instead of this:
/// A number between 0 and 255
const SMALL_NUMBER: u8 = make_number!();
Making them compile errors should also result in a faster CI failure, since the presence of a compile_error! macro in the generated code will make compilation fail pretty quickly.
Agreed 👍
Yeah, the empty lines are from the if statements never being entered. E.g.
Maybe add some -% to reduce the clutter.
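For reference, my understanding of Tera's whitespace control is that {%- trims whitespace before a tag and -%} trims it after, so the empty-line example from earlier could be rewritten along these lines (exact marker placement depends on the output you want):

```tera
{% for n in [1,2,3] -%}
    {%- if n > 4 -%}println!("Hello!"){%- endif -%}
{%- endfor %}
```

With the trim markers, the tags that render nothing no longer leave their surrounding newlines behind in the generated file.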
Very impressive @spenserblack 😮 I left a few comments, but I'll finish the review over the weekend 👍
Co-authored-by: Ossama Hjaji <ossama-hjaji@live.fr>
… into dev/json-languages
I pushed a few changes (I hope you don't mind). @spenserblack Do you think we can merge the PR as is? Or do we wait until we've added unit/integration tests?
I don't mind at all, please and thanks!
I'm definitely not opposed to adding some tests, but I looped through all of the languages and they all looked good. Looks like you already fixed the logos in commits like 3e759a8, thanks!
Given that the only thing that's "public" is the binary itself and its output to the terminal, I was thinking we could start with testing several variations of
TBH I was thinking of putting off these types of tests until Hacktoberfest, because it would be pretty repetitive 😅 IMO we'd want 2 tests for each ASCII logo, both basic and true color, to guarantee stability. At this point, I think the output is stable enough that we'd want tests to guarantee stability during refactors like this. I'm focusing mainly on logos just because of this PR, but we'd probably want to test the stats output, too. And with that comes variations of the git repository being analyzed, so that would mean testing different repositories (either real repositories or by mocking gitoxide).
We might also want to test the CLI itself. While working on #685 I was surprised by a few panics. Tests could ensure that the CLI we put a lot of work into doesn't break with clap updates, or for any other reason.
So that was a lot of words to say that, because this is a binary with no library, if our tests guarantee stability of input via the CLI and output to the terminal, I think we're good.
Glad to see you've already put some thought into it. IMO, we won't be able to skip tests for much longer, as more people seem to be playing with this tool. Maybe we can merge this PR as is to be able to work on #696, and create a separate issue on that matter as a milestone for the next release.
This uses a templating tool, Tera, to read the language data and generate the appropriate Rust code, instead of using a macro. Some of the benefits of this:
Matching basic and true color lengths are now enforced with a compile-time error instead of a runtime test.
There are some unnecessary commits in this PR: making a temporary onefetch lib implementation, making a temp binary to export onefetch data to JSON, etc. Those don't need to stay in this PR and can be rebased away, but I thought it was useful to preserve the history in some form.
For #696
To do
CONTRIBUTING.md