Skip to content

[Exercise] implement variable-length-quantity exercise #1053

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 25, 2025

Conversation

Wiguwbe
Copy link
Contributor

@Wiguwbe Wiguwbe commented May 24, 2025

Hello,

A few days ago I was trying to create a encoder/decoder for SQLite's Varints. Instead of implementing the tests by hand, I searched for already implemented tests.

I stumbled upon Exercism and the VLQ exercise in Python. As it was not implemented in C, I decided to port it. Hopefully someone will find it fun/useful as well.

Note that this is my first time contributing to (or even using) Exercism.

The test generation was automated using the canonical-data.json as input. It's a relatively specific script (and implemented in Perl), but I'm attaching here still:
gen-tests.pl.gz (gzip-ed so that github accepts it)

I have a couple questions for the reviewers:

  • The example.c may no be very reader-friendly. Is it expected to be read by someone or is it just to test the tests?
  • I'm using uint64_t for the decoded integers, but it's not expected to have integers bigger than 32-bits. Is it preferable to use 32 bit integers?

@siebenschlaefer
Copy link
Contributor

Hi,

thank you for translating this exercise to C! I just noticed a few things:

  • The include guard is VLQ_H.
    In all other exercises it's the name of the directory, in uppercase, hyphens replaced by underscores, with _H at the end (e.g. RUN_LENGTH_ENCODING_H in the run-length-encoding exercise).
    I'd suggest using VARIABLE_LENGTH_QUANTITY_H instead of VLQ_H.
  • The comments in the .h file explain the return value of the two functions.
    Usually when a .h file contains declarations of the functions with comments, they are similar to "docstrings" (e.g. in the exercises anagram, high-scores, linked-list, series, sieve). They explain what the function does, not only one aspect of it. The only exception so far with comments similar to this one is in the react exercise.
    Instead of just explaining the result I'd prefer a more complete comment that reads as if it were directed to the users of the function. Or, if this comment is meant for the students, move it to the .c file.
  • The first parameter of both functions in the stubs is a uint64_t *.
    If we want to provide declarations of the two functions in the .h file I'd prefer the first parameter being a pointer to const. I'd argue that this is best practice for pointers to read-only arguments.
  • The second parameter of both functions in the stubs is an int.
    In most other exercises we use size_t for parameters that represent a length.
    IMHO a signed type is fine, I just wanted to point that out.
  • The stub of decode() uses the term "buffer" for the input.
    Personally I'd prefer something like bytes because in my experience "buffer" is most commonly used for output.
  • All tests are enabled, right from the start.
    Most exercises have only the first tests enabled, the other tests are disabled with the TEST_IGNORE(); macro, and the first of those lines is followed by the comment "// delete this line to run test" (e.g. in the knapsack exercise.)
    I'd suggest doing that in this exercise, too.
  • The tests use decimal integer literals and call TEST_ASSERT_EQUAL_UINT8_ARRAY and TEST_ASSERT_EQUAL_UINT64_ARRAY to compare the actual and expected result.
    A comment in the tests file recommends displaying numbers as hex literals. IMHO that applies to the literals in the test file and the output of the tests.
    I'd suggest using hexadecimal integer literals in the tests and calling TEST_ASSERT_EQUAL_HEX8_ARRAY and TEST_ASSERT_EQUAL_HEX64_ARRAY (see the documentation).
  • In the example solution there's a section for decoding SQLite_VARINT with a maximal length of 9 bytes and a special treatment of the 9th byte.
    I would really prefer not to include that. It's not relevant for this exercise, it makes the code longer, it increases the amount of code that has to be maintained, and we would need additional tests just for that section to ensure the code is correct.

Otherwise this looks good!


The example.c may no be very reader-friendly. Is it expected to be read by someone or is it just to test the tests?

AFAIK we haven't discussed that yet for the C track. Its main purpose is to prove that the exercise can be solved. But in my opinion it should be a "good" solution, that's correct, somewhat easy to read, idiomatic, and reasonably efficient.

I'm using uint64_t for the decoded integers, but it's not expected to have integers bigger than 32-bits. Is it preferable to use 32 bit integers?

Personally I would prefer 32-bit integers, I think that would match the instructions a little bit better ("* for this exercise we will restrict ourselves to only numbers that fit in a 32-bit unsigned integer*"). But I can live with 64-bit integers.

But (disclaimer) I'm not a maintainer of the C track, I have no vote here.


Just to spare the maintainers some time: The makefile and test framework are identical to the other exercises. The .docs and tests are complete. I was able to solve the exercise using the provided files. I have not found any issues or problems other than what I wrote above.

@wolf99
Copy link
Contributor

wolf99 commented May 24, 2025

Thank you for submitting @Wiguwbe .

Good review @siebenschlaefer .

- extend header guards;
- explain better the functions in the headers;
- use `const` for array inputs;
- use `size_t` for array lengths;
- rename input param `buffer` -> `bytes`;
- disable most tests (enable only first of each test group);
- use hex assert variants;
- remove SQLite section;
- use 32 bits integers (vs 64).
@Wiguwbe
Copy link
Contributor Author

Wiguwbe commented May 24, 2025

Many thanks @siebenschlaefer for the very good review indeed.

I believe I have addressed all of the items. I kept it on a separate commit for ease of reading, I can squash it into the previous one if it preferable later. Let me know if the changes satisfy.

(...) it should be a "good" solution, that's correct, somewhat easy to read, idiomatic, and reasonably efficient.

I never really did any C code professionally, or have it reviewed, so I'm not sure what "good"/idiomatic/easy to read code looks like 😄 . I'll see about improving tomorrow, hopefully I won't make it worse, but if it's not a blocker I'd say to not wait for me.

making it more clear that encoding uses a LIFO
@ryanplusplus
Copy link
Member

Thanks!

@ryanplusplus ryanplusplus merged commit a6cd290 into exercism:main May 25, 2025
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants