Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sciform review #121

Open
14 of 30 tasks
jagerber48 opened this issue Jul 21, 2023 · 69 comments
Open
14 of 30 tasks

sciform review #121

jagerber48 opened this issue Jul 21, 2023 · 69 comments

Comments

@jagerber48
Copy link

jagerber48 commented Jul 21, 2023

Submitting Author: Justin Gerber(@jagerber48)
All current maintainers: (@jagerber48)
Package Name: Sciform
One-Line Description of Package: A package for converting python numbers (floats, Decimals) into scientific-formatted strings more suitable for reading and presentation.
Repository Link: https://github.com/jagerber48/sciform
Version submitted: 0.24.0
Editor: @Batalex
Reviewer 1: @isabelizimm
Reviewer 2: @machow
Archive: TBD
Version accepted: 0.33.0
Date accepted (month/day/year): 02/07/2024


Code of Conduct & Commitment to Maintain Package

Description

  • Include a brief paragraph describing what your package does:
    sciform is used to convert python float objects into strings according to a variety of user-selected scientific formatting options including fixed-point and decimal and binary scientific and engineering notations. Where possible, formatting follows documented standards such as those published by BIPM or IEC. sciform provides certain options, such as engineering notation, well-controlled significant figure rounding, and separator customization which are not provided by the python built-in format specification mini-language (FSML) for formatting floats into strings. In addition, sciform provides functionality for formatting pairs of floats as value +/- uncertainty pairs according to a variety of scientific standards.

Scope

  • Please indicate which category or categories.
    Check out our package scope page to learn more about our
    scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):

    • Data retrieval
    • Data extraction
    • Data processing/munging
    • Data deposition
    • Data validation and testing
    • Data visualization1
    • Workflow automation
    • Citation management and bibliometrics
    • Scientific software wrappers
    • Database interoperability

Domain Specific & Community Partnerships

- [ ] Geospatial
- [ ] Education
- [ ] Pangeo

Community Partnerships

If your package is associated with an
existing community please check below:

  • For all submissions, explain how the and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):

    • Who is the target audience and what are scientific applications of this package?
      The target audience for this package includes any scientist trying to print or otherwise display numerical data in a way that is easily readable and conforms to various scientific standards. Numerical data that users may be interested may be numbers (possibly with uncertainties) coming from literature (such as scientific constants), numbers that tabulate raw measurement results or numbers resulting from calculation analyses (such as best-fit algorithms applied to data). Even numbers appearing in plot tick labels can be formatted using sciform. See https://sciform.readthedocs.io/en/stable/examples.html for some example use cases.

    • Are there other Python packages that accomplish the same thing? If so, how does yours differ?

      1. Python built-in string formatting mini language (https://docs.python.org/3/library/string.html#format-specification-mini-language). sciform includes its own string formatting mini language closely based on the built in one, but with some differences. Notably sciform includes well-controlled significant figure formatting, engineering notation, binary formatting, SI/IEC prefix substitution, digit grouping and decimal symbol options (helpful for a diversity of locales), exponent value coercion, as well as value +/- uncertainty formatting functionality.
      2. The uncertainties package (https://pythonhosted.org/uncertainties/). sciform was heavily motivated by this package. This package has sophisticated statistical handling of value +/- uncertainty pairs, handling error propagation and simulation under-the-hood. In addition, it has its own extension of the mini language for formatting value +/- uncertainty pairs. sciform has more formatting functionality than the uncertainties package including, especially, engineering notation, grouping separator controls, and prefix substitution. sciform is also a much lighter weight requirement than the uncertainties package. This may be desirable when a user wants to format strings, but they don't need the rest of the full statistical machinery of the uncertainties package.
      3. The prefixed package (https://github.com/Rockhopper-Technologies/prefixed). sciform was also motivated by the prefixed package. This package provides a sort of engineering notation where exponents are rounded to multiples of 3, and then exponents area always replaced with their corresponding SI exponent. prefixed package is a more conservative extension of the built-in formatting language. sciform includes more functionality including engineering notation without prefix substitution and more grouping/decimal symbol control. sciform also includes global configuration options for handling optional SI prefixes such as c, d, da, and h.
      4. The sigfig package (https://sigfig.readthedocs.io/en/latest/). The sigfig package has similar functionality to sciform including sig fig rounding, separator control, value +/- uncertainty formatting including some features that are only forthcoming in sciform. sig fig does not currently support binary formatting. sig fig also does not provide a format specification mini language for formatting floats. Rather floats are formatted using an overload of the built-in round function which I find to be slightly awkward compared to a Formatter object or function.
    • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:
      Presubmission Inquiry for sciform (float -> string scientific formatting) #114 @NickleDave

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • uses an OSI approved license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a tutorial with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration setup, such as GitHub Actions CircleCI, and/or others.

Publication Options

JOSS Checks
  • The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
  • The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
  • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
  • The package is deposited in a long-term repository with the DOI:

Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

  • Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Confirm each of the following by checking the box.

  • I have read the author guide.
  • I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

Please fill out our survey

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The editor template can be found here.

The review template can be found here.

Footnotes

  1. Please fill out a pre-submission inquiry before submitting a data visualization package.

@NickleDave
Copy link
Contributor

Linking back to discussion on the presubmission issue:
#114 (comment)

We will hold off on starting this review for just a bit (two weeks maximum) while we help @jagerber48 with some questions about development and scope.

@jagerber48
Copy link
Author

I have come to decisions on most of the questions I had that @NickleDave referred to in the previous comment. For one question I opened a topic in the pyopensci discourse: https://pyopensci.discourse.group/t/how-to-avoid-repeating-a-long-list-of-keyword-options-throughout-a-package/331/5. The suggestions there are very good and I plan to explore them in upcoming tests and versions for sciform. The suggestion I will likely take will result in changes to the user-facing API and hopefully a large cleanup of the codebase.

@NickleDave how do you suggest we proceed with the review? I'm in an active phase of development so I'm not sure what makes sense in terms of hitting a moving target. I just made an important update in version 0.22.0 involving changing from float-based formatting on the backend to Decimal-based formatting. For now I will set 0.22.2 as the version submitted. I anticipate version 0.23.0 will incorporate changes to avoid the code duplication suggested in the discourse topic, and I said above, this will be a pretty major modification to the codebase and a change to the user API. I imagine I'll be able to implement these changes in the next month, probably the next two weeks. Given that timeline, what makes sense for the review timeline? Also, in this post you suggested I could release these changes as a release candidate version. Were you imagining that release candidate might appear before the review, during the review, or after? Would the release candidate be the main target of the review or, e.g. 0.22.2 without the new changes?

I don't have any rush to get this review going if we want to wait for more recent code on the one hand. But on the other hand, if I keep changing the code and not starting the review we'll never hit the moving target..

@NickleDave
Copy link
Contributor

Hi @jagerber48 thank you for updating here.

You are right that we should not let this get to a place where you keep changing the code.
But I think the best thing to do is wait until you incorporate changes suggested in the Discourse topic, then start right away.

Please release that version, which you expect to be 0.23.0, and then set that as the version submitted above.

@jagerber48
Copy link
Author

Thanks. That makes the most sense to me also. That's what I'll do.

@NickleDave
Copy link
Contributor

Great, thank you @jagerber48 -- once you reply back that you've released that version, we'll go ahead with the review

@jagerber48
Copy link
Author

Ok. I've implemented a version of the changes discussed in the discourse topic. I'll post details in the topic since progress has been made but there may still be room for improvement. sciform is on version 0.24.0 for now and I think review can move forward. I have a few small-scale (I think) questions that I'm curious to have addressed/considered during the review. I'll compile and post them here.

I may continue to make small code/docs changes and releases during the review but I'm happy to pin the review to version 0.24.0. Alternatively, if it would be better for me to use some other git workflow during the review I'm happy to do that. One of my questions is generally what my git workflow should be for this package. Right now I just have a main branch and I make feature branches that have one main feature but sometimes a small feature or two sneaks in. I then just PR merge that feature branch into the main branch. I don't really have much experience releasing versions of code so I'm just going off some stuff I've learned. Not experience with release branches, pre-releases, etc. (Happy to discuss in more detail later, just want to answer now if I should do anything differently immediately during the review time).

@jagerber48
Copy link
Author

I pushed version 0.25.0 which follows a more updated version of what was discussed in the thread for reducing code repetition. If the review has already started or anything happy to keep the review looking at 0.24.0.

Some general questions I have that may be in-scope for the review:

  • Suggested git branching model, or modifications to my existing model. Right now I branch off of main for a feature branch, I work on that, then I PR it into main using github. However, I often sneak 1 or 2 additional features in addition to the main feature into these feature branches. Is this a bad practice? What should I do instead?
  • Should I include changelog even for minor version bumps?
  • Right now when I merge a PR I have to manually tag the branch on my local repo, push up the tag, then build the code into a sdist/whl and then upload to pypi. I guess I should automate these steps? Are there example for good ways to do this?
  • Right now on my github PRs I have github actions to run tests and linting on different python versions. Is there a way for me to run these tests all on my local system with github? Or is this type of automation typically done with some third party build/automation tool?

Then I'll also mention that I've found making the documentation good and consistent has been my biggest challenge/time spent on this project. I continue to have ideas for documentation improvements but documentation has been a moving target as the code has continued to change. Just to say: I expect errors in the documentation and places I can make improvements.

@jagerber48
Copy link
Author

I don't have any non-trivial breaking changes in mind for the package at this time (though stuff always seems to come up). The only API changes I have in mind are possible name changes to options or classes. There are a few I'm not 100% happy with so I may poll pyopensci or e.g. code review stack exchange to try to get some broader set of opinions on some naming choices.

Just two I've written down:

  • precision is overloaded to mean (1) the number of digits after the decimal point to round to in RoundMode.PREC mode and (2) the number of sig figs to round to in RoundMode.SIG_FIG. It isn't really the best name for either of these. Maybe something like num_round_digits?
  • SciNum and SciNumUnc used to be sfloat and vufloat before the code was refactored to use Decimal over float. I'm not really in deep love with any of these names.

Those are just two I had written down, but pretty much any name in the API is fair game for naming improvement suggestions.

@NickleDave
Copy link
Contributor

Thank you for letting us know @jagerber48.
We are wrapping up a couple of other reviews but we expect to have an editor free to start this one shortly.
They will make sure reviewers take your requests for feedback into consideration.

@jagerber48
Copy link
Author

sciform is now on version 0.26.2. Version 0.26.0 introduces a number of trivial name changes to various options. 0.26.1 introduced more unit tests for better coverage (I learned how to measure unit test coverage) and helped me discover one bug and another bug/unexpected behavior. The first bug is fixed in 0.26.2. The second bug requires some convention decisions to be made about the format specification mini language. See jagerber48/sciform#29.

@Batalex
Copy link
Contributor

Batalex commented Sep 16, 2023

Hi @jagerber48,

As we discussed, I will lead the review for sciform.
I am looking for reviewers and will let you know when I am done.

@jagerber48
Copy link
Author

Here's an update on the current status of the package and some questions. Since making this submission I modified the code quite a bit moving up to version 0.29.0. Since then I've taken a break from updating sciform because I think all of the core functionality I envisioned is in place. The tasks I see (I think it's likely other will arise during the review) in front of me are

  • Trying to fully stabilize the API. I think it's close but there are few "opinion-based" questions I have about e.g. the naming of some options and similar scope. See below.
  • Adding non-core features. Importantly there is one important feature I want to add which is pre-defined FormatOptions and corresponding Formatter classes. Hopefully these can exist and make it so that users rarely if ever have to actually interface with configuring their own options and can just select a pre-configured formatter and use that. Or maybe set pre-configured global default options and use a generic formatter.

For adding the non-core features I plan to wait until after the pyopensci review and possibly until after I release version 1.0.0 with a more stable API (which will also happen after the pyopensci review).
For stabilizing the API since I feel a lot of the question I have are opinion-based I'm excited to get opinions from other people than me! I'd hope that some of these questions can be debated or discussed during the review if they're in scope. Here are the current questions I have listed:

  • Enums. Right now when configuring a FormatOptions object the user has to import various enum objects like the ExpMode or ExpFormat enum to configure certain settings. Is it too much of a pain to have to import all of these objects to choose settings? What would be a better alternative? (I know bare strings could be used but these are error-prone...). Also in some cases a sentinel like AutoDigits or AutoExpVal needs to be imported. (@Batalex already suggested using Literal. There's also the possibility of a StrEnum which might support passing an enum instance or the string representation..)
  • Formatter and FormatOptions. Right now creating a Formatter object is a two step process which first involves creating FormatOptions object then passing it to the Formatter constructor. This can be done in one line but takes up a lot of characters. The FormatOptions can't be dispensed with because it is also used to conveniently configure globa[l options and new sets of format options. I considered allowing passing the kwargs to form FormatOptions directly to the Formatter constructor, but there were various downsides to that approach because of just how many options there are. See this discussion for further details. Is this approach to burdensome? How could it be improved (taking into account all of the points made in the linked discussion).
  • By default value/uncertainty pairs are formatted as 123 +/- 42. But using the unicode_pm=True option it is possible to format this as 123 ± 42. Should the unicode_pm option be True by default? Or even further, should unicode_pm=True just be the only behavior and the option is dispensed with entirely??
  • I'm not in love with some of the names chosen for various objects and options. Looking for suggestions for improvements on naming ANYWHERE throughout the package (even possibly the package name). But a few that I've singled out are:
    • Is top_dig_place an ok name?
    • Is superscript_exp an ok name? Maybe just superscript?
    • Is bracket_unc an ok name?
    • Is SciNum an ok name?
    • Is SciNumUnc an ok name?
  • Originally I wasn't planning support for binary formatting modes but I was inspired to do so by the prefixed package. Right now binary formatting modes are accessed via dedicated exponent modes. I wonder if the two binary modes should instead be accessed via a separate option indicating the exponent base (10 or 2 at the moment). I'm also curious to query interest levels in the binary formatting. I personally have no interest in the binary formatting since my work is mostly in physical sciences and I use decimals and SI. But I imagine people more in computer sciences may have more need for binary formatting.
  • Right now there's an issue with the sciform FSML where the separator options and rounding mode option collide. See the github issue. This issue came about from when i was trying to include all options in the FSML. But since then I've built out FormatOptions much more, setup global configuration options, and abandoned the idea that all options should be configurable via the mini language. I'm leaning towards just dropping the ability to configure separators from the FSML. Objections to this approach?

These are some of the main existing api features that I have questions about. I.e. making decisions on the above will constitute breaking changes, so I want to address these before 1.0.0. I am ALSO open to feature requests but will be implementing those with lower priority than stabilizing the API.

Thank you so much for taking the time to read this and also thank you very much for any feedback you are able to provide! I'm also curious to hear suggestions for other communities where I could seek feedback.

@Batalex
Copy link
Contributor

Batalex commented Oct 5, 2023

👋 Hi @isabelizimm and @machow! Thank you for volunteering to review for pyOpenSci. I am truly excited by the team we have on this review.

@jagerber48, Isabel and Michael are involved with quarto, and I believe they have some pretty good opinions on how to format numbers for publishing. I am looking forward to seeing their review!

Please fill out our pre-review survey

Before beginning your review, please fill out our pre-review survey. This helps us improve all aspects of our review and better understand our community. No personal data will be shared from this survey - it will only be used in an aggregated format by our Executive Director to improve our processes and programs.

  • reviewer 1 survey completed.
  • reviewer 2 survey completed.

The following resources will help you complete your review:

  1. Here is the reviewers guide. This guide contains all of the steps and information needed to complete your review.
  2. Here is the review template that you will need to fill out and submit
    here as a comment, once your review is complete.

Please get in touch with any questions or concerns! Your review is due: 27 October 2023.

Reviewers: @isabelizimm, @machow
Due date: 2023/10/27

@jagerber48
Copy link
Author

Started running the code on real applications myself some more and quickly found some important bugs. Just made a release of version 0.29.1 that fixes these bugs.
https://github.com/jagerber48/sciform/releases/tag/0.29.1

@isabelizimm
Copy link
Contributor

isabelizimm commented Oct 25, 2023

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README.
  • Installation instructions: for the development version of the package and any non-standard dependencies in README.
  • Vignette(s) demonstrating major functionality that runs successfully locally.

Has examples page

  • Function Documentation: for all user-facing functions.
  • Examples for all user-facing functions.
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING.
  • Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements
The package meets the readme requirements below:

  • Package has a README.md file in the root directory.

The README should include, from top to bottom:

  • The package name
  • Badges for:
    • Continuous integration and test coverage,
    • Docs building (if you have a documentation website),
    • A repostatus.org badge,
    • Python versions supported,
    • Current package version (on PyPI / Conda).

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

  • Short description of package goals.
  • Package installation instructions
  • Any additional setup required to use the package (authentication tokens, etc.) (N/A for sciform)
  • Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
    • Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
  • Link to your documentation website.
  • If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
  • Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Package structure should follow general community best-practices. In general please consider whether:

  • Package documentation is clear and easy to find and use
  • The need for the package is clear
  • All functions have documentation and associated examples for use
  • The package is easy to install

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed. No specific performance claims have been made.
  • Automated tests: on GitHub Actions
    • All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
    • Tests cover essential functions of the package and a reasonable range of inputs and conditions. Has 100% test coverage! 🎉
  • Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
  • Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
    A few notable highlights to look at:
    • Package supports modern versions of Python and not End of life versions.
    • Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 4-5


Review Comments

Overall, this package is super nifty and clearly offers a huge quality of life increase in scientific publishing. The documentation, as a whole, is pretty comprehensive with lots of examples and a mix of API and narrative style docs. Thank you for a great addition to pyOpenSci!

Response to a few maintainer questions

I've answered a handful of your questions that haven't had a response yet + I have a response to, but happy to have discussion on any other points you're interested in chatting about! 😄

By default value/uncertainty pairs are formatted as 123 +/- 42. But using the unicode_pm=True option it is possible to format this as 123 ± 42. Should the unicode_pm option be True by default? Or even further, should unicode_pm=True just be the only behavior and the option is dispensed with entirely??

I think setting unicode_pm=True as a default is the right move! I think it is a delight sciform offers to users to not have to type ± 😄 But, there are a lot of configuration parameters and it could be missed, so this makes the out-of-the-box experience feel a little extra polished.

Is top_dig_place an ok name?

I wonder if you could somehow work padding into the name? My thought process behind that: top_dig_place looks to be used in the general purpose Formatter where arguments have a variety of use cases (padding, decimal places, separators), so it's use for padding might not be apparent. (These are not very good suggestions, but something like pad_to or pad_to_exp?)

Is superscript_exp an ok name? Maybe just superscript?

I have a slight preference to superscript.

Is bracket_unc an ok name?

For bracket_unc and SciNumUnc alike, my brain did not go to uncertainty for unc. I do realize that a other arguments build off this name (bracket_unc_remove_seps), so brevity is important. (bracket_uncertain + bracket_uncertain_seps maybe? you would have to flip the boolean for the separators then since you drop the negative)

Is SciNum an ok name?

I like SciNum.

Is SciNumUnc an ok name?

It was not immediately apparent to me that SciNumUnc was for uncertainty, but once I made that connection, it made a lot of sense. SciNumUncertain is a bit longer, but more clear. I do think Uncertainty alone isn't verbose enough since you still need your scientific number. FWIW I ended up typing SciNumU and then auto-completing the rest, so I'm not too afraid of longer names.

I wonder if the two binary modes should instead be accessed via a separate option indicating the exponent base (10 or 2 at the moment). I'm also curious to query interest levels in the binary formatting. I personally have no interest in the binary formatting since my work is mostly in physical sciences and I use decimals and SI.

For binary formatting, maybe open an issue/discussion about it and ask people to 👍 if they would like to see it implemented? That way you're not building things you don't have interest in. However, if you already have the plumbing in place, and it's just a matter of exposing an argument, and you feel compelled to do so, then that seems like a good interrim move!

Should I include changelog even for minor version bumps?

I think yes! For every release, it is super useful to see the changes. If it feels like too much effort to go back and write everything manually, even something small like using the Generate Release Notes button when creating a new release is sufficient. 😄

Right now on my github PRs I have github actions to run tests and linting on different python versions. Is there a way for me to run these tests all on my local system with github? Or is this type of automation typically done with some third party build/automation tool?

Something you can do locally is install and use pre-commit. You can have pre-commit hooks do a variety of things, but some common use cases are running black and flake. You can also configure testing to run on pre-commit, but this is less common to do since sometimes tests can take a while to run.

Optional nits:
  • To make documentation even easier to find, you can add as a url in the settings to show up here ⬇️ (you should see a ⚙️)
sciform-url
  • It would be useful to give install instructions for a development version of the package (eg, pip install git+https://github.com/jagerber48/sciform for the latest changes)
  • For more readable code, you might want to add black or some other formatting standard. To help automate this, running the linter can be added in a pre-commit hook or in CI.

@Batalex
Copy link
Contributor

Batalex commented Oct 27, 2023

Thank you kindly for this thorough review!

@machow
Copy link

machow commented Oct 28, 2023

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README.
  • [] Installation instructions: for the development version of the package and any non-standard dependencies in README.
  • Vignette(s) demonstrating major functionality that runs successfully locally.
  • Function Documentation: for all user-facing functions.
  • Examples for all user-facing functions.
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING.
  • Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements
The package meets the readme requirements below:

  • Package has a README.md file in the root directory.

📝 NOTE that the package uses a README.rst, but maybe this meets the requirement?

The README should include, from top to bottom:

  • The package name
  • Badges for:
    • Continuous integration and test coverage,
    • Docs building (if you have a documentation website),
    • A repostatus.org badge,
    • Python versions supported,
    • Current package version (on PyPI / Conda).

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)

  • Short description of package goals.
  • Package installation instructions
  • Any additional setup required to use the package (authentication tokens, etc.)
  • Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
    • Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
  • Link to your documentation website.
  • If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
  • Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
Package structure should follow general community best-practices. In general please consider whether:

  • Package documentation is clear and easy to find and use.
  • The need for the package is clear
  • All functions have documentation and associated examples for use
  • The package is easy to install

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests:
    • All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
    • Tests cover essential functions of the package and a reasonable range of inputs and conditions.
  • Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
  • Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
    A few notable highlights to look at:
    • Package supports modern versions of Python and not End of life versions.
    • Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 2.5


Review Comments

Suggested git branching model, or modifications to my existing model. Right now I branch off of main for a feature branch, I work on that, then I PR it into main using github. However, I often sneak 1 or 2 additional features in addition to the main feature into these feature branches. Is this a bad practice? What should I do instead?

I think that--as the package maintainer--if bundling multiple features into a PR is working for you, then it seems okay. It seems more important to limit the scope of PRs to other people's repositories, so that 1 feature doesn't get held up waiting for feedback / discussion on others.

Right now when I merge a PR I have to manually tag the branch on my local repo, push up the tag, then build the code into a sdist/whl and then upload to pypi. I guess I should automate these steps? Are there example for good ways to do this?

You can use the github action https://github.com/pypa/gh-action-pypi-publish to publish to PyPI whenever you make a github release. Here's an example in a tool I maintain called quartodoc, in case it's useful!

Right now on my github PRs I have github actions to run tests and linting on different python versions. Is there a way for me to run these tests all on my local system with github? Or is this type of automation typically done with some third party build/automation tool?

You can use tox, though I just push to github and let the github actions run the tests. There's also act, for running github actions locally (though I've never used it).

SciNum and SciNumUnc used to be sfloat and vufloat before the code was refactored to use Decimal over float. I'm not really in deep love with any of these names.

I have no good opinions on the exact names, but like that they use camel case, so feel more class-y.

Adding non-core features. Importantly there is one important feature I want to add which is pre-defined FormatOptions and corresponding Formatter classes.

This seems super handy!

Enums. Right now when configuring a FormatOptions object the user has to import various enum objects like the ExpMode or ExpFormat enum to configure certain settings. Is it too much of a pain to have to import all of these objects to choose settings? What would be a better alternative?

Tools like pydantic can take Literals or Enums, and give nice validation / feedback if an invalid value is passed. If pydantic feels too heavy duty, I like the suggestion of FormatOptions using Literal, and doing some validation. At a certain point, many options mapping to their own Enums starts to feel a bit cumbersome to type.

Formatter and FormatOptions. Right now creating a Formatter object is a two step process which first involves creating FormatOptions object then passing it to the Formatter constructor.

If Formatter can only be initialized using a single FormatOptions instance--and FormatOptions is all data no methods--then, it seems like they can be a single class. (It seems like the combined class of these two things could still be used to configure the global options).

By default value/uncertainty pairs are formatted as 123 +/- 42 ... Should the unicode_pm option be True by default?

I'm not in love with some of the names chosen for various objects and options.

Originally I wasn't planning support for binary formatting modes but

Not helpful opinions here :/. I think that the package is so useful for this kind of problem that I'd be motivated as a user to figure out what all the names meant, though!

Right now there's an issue with the sciform FSML where the separator options... I'm leaning towards just dropping the ability to configure separators from the FSML. Objections to this approach?

The FSML seems really neat, but reading through the documentation / trying the package, it feels easier to specify FormatOptions (and with literals or some approach that makes seeing options in autocomplete / code completion in eg VS Code would be very fast to initialize!). I always have to look up the options in the built-in string formatting mini-language, but initializing a class gives a lot of nice hinting / auto complete options for helping folks.

(I could see the FSML being nice for quick, common formatting though!)

Overall, it seems like a really helpful library!

@Batalex
Copy link
Contributor

Batalex commented Oct 28, 2023

Thanks a bunch, @isabelizimm and @machow, for the reviews!

@jagerber48 I also took a few notes (without the suitable format, editor's privilege 🐈‍⬛ )

Project documentation

README

Instructions are clear, and the statement of need as well.
Lacking a few elements from our guide, nothing too serious.
Python version compatibility needs to be clarified.

Most of those issues have been solved in the meantime.

Contributing

Missing as of now.

License

Ok.

Changelog

Ok.

Installation

Installation resolves correctly.

pyproject.toml

The classifiers are lacking (python version).
Nice setup for the dynamic version & readme.

More classifiers in recent releases.

General remarks and QoC

  • I would advise using a code formatter, such as black.

  • I would also strongly advise sorting imports using isort or ruff (❤️ ).
    (Maybe isort since the linter used is flake8, but I sure won't stop you from migrating to Ruff)

  • Is there any reason for the comments to start with #:?

  • A few type hints are wrong; a parameter cannot take a default value of None if its type does not allow it.

  • There are a lot of exceptions raised throughout the code base.
    We could simplify it by removing those checks in private functions.

modes.py

  • FillMode: the exception raised is unneeded. Since enums can only take a finite set of values, we cannot have anything other than space or zero.
  • GroupingSeparator: same as above.

I am on the fence concerning using Literal type hints with the values.

format_utils

  • Maybe use a type alias for Number
  • A few type hints are missing (get_top_digit)
  • get_mantissa_exp_base seems a little too complex. Maybe it can be simplified.
  • get_round_digit. The two innermost conditions lead to the same result.

API design

I want to propose something regarding the use of the sciform. Currently, the return types for the format functions are strings, which are acceptable for the project's current goal.
I would like to propose making a FormattedValue class inheriting from str.
The reason is that we could then include methods such as _repr_html_ to use html syntax in notebooks, or other mime types (LaTeX export to use with quarto?).

@jagerber48
Copy link
Author

@isabelizimm @machow @Batalex thank you so much for the time you've taken to review this package and all the great feedback you've given me! I'll spend some time looking at this feedback and implementing the changes. Some of the changes seem easy, but some of them may require structural considerations. My first step is going to be to try to list which changes I plan to implement and how. I'll likely post that here before I get to work on them.

@Batalex how does the review work from this point forward? Are there any timelines or anything I should be aware of?

@jagerber48
Copy link
Author

jagerber48 commented Dec 11, 2023

Ok, thanks for the response. In that case, the list is indeed getting short. I'm going to collect the remaining items I have

  • Annotated FSML examples. These would be useful. They'll take some time to get in. I think the separator configurations, and the exponent value forcing settings are probably the most new/confusing.
  • Rename top_dig_place to something like left_pad_digit or left_pad_to_digit
  • Rename superscript_exp to superscript
  • Think about clarity of the unc abbreviation. Consider revising the interface so that options specific to value/uncertainty formatting sit in a separate place than the rest of the options. This might allow for some improved options names
  • Think about SciNum/SciNumUnc naming.
  • Consider using precommit hooks for linting/formatting/tests (in lieu of running github actions locally)
  • Add development install instructions. @isabelizimm, you were the one who suggested I end develop install instructions, curious for your and others' thoughts on the following. I'm curious what is typical here. My first guess would be to recommend devs fork and git clone the repo then pip install -r requirements-dev.txt or something. I know I could configure pyproject.toml to allow python -m pip install sciform[dev] to work. But I think I'd prefer the former approach so they already have it set up as a git repo and are able to push and make pull requests. And then what should dev requirements be? sciform itself has no requirements for installation and use. For simple dev I guess devs would want ruff installed. For editing docs I need a couple sphinx and sphinx utility packages. For the examples I need the heavyweight numpy/scipy/matplotlib requirements. Should I include ALL of these as dev requirements? Should I have separate requirements files for each of these different dev use-cases? I'm leaning towards this latter approach.
  • Research tox or act for running github actions locally. I think the main challenge here is spinning up environments that use all the different versions of python I test against the way github actions do. But I wonder if this is what act can do. Looks like act uses docker which sounds like a pain on Windows. Maybe I could run act on a Linux VM or something. But I think I'll likely conclude this is not worth the effort. I can just use github action on github. The worst case is I end up with some throwaway commits in PRs whose only purpose was to test some github action behavior.
  • Consider the FormattedValue suggestion to support converting formatted values to a number of output formats including, e.g. string literal, Latex formatted string, html formatted string.

@isabelizimm
Copy link
Contributor

isabelizimm commented Dec 12, 2023

These changes and updates look great! If you would like second eyes on any changes going in, feel free to tag me in PRs. I tend to be best summoned by an @ 😄

Re-development instructions:

Minimum: add a line in installation to show people how to get latest version, eg

pip install git+https://github.com/jagerber48/sciform.git

This will show people how to get latest development changes, even if they're not released yet. (example)

Slightly more effort: some documentation, either in a README or CONTRIBUTING file to show mainly 1) how to get to a local install, 2) how to build docs and 3) how to run tests. One example is the CONTRIBUTING.md guide in the package pins..

How exactly you want to configure those is up to you. My experience is that requirements.txt files tend to get stale, so I generally lean towards sciform[dev] via pyproject.toml route. However, the sciform package itself doesn't have any dependencies, so I think that using requirements files is a perfectly valid move in this scenario if it works better for you! If you want to make it different requirements files, splitting it out to requirements-test/requirements-docs makes sense, but probably no more modular than that.

@jagerber48
Copy link
Author

@isabelizimm Thanks for all that info. Ok, it sounds like your main suggestion is to include instructions for how people can install the latest development/pre-release version of the code to e.g. test it out. That is easy to do. The other part that we've been discussing is to include instructions for people who want to develop sciform. These two things have different requirements.

In any case, I have a PR with both of them I would appreciate you having a look at if you're able to: jagerber48/sciform#90. I think it's relatively straight forward.

I decided to go with the pip install -e .[dev] approach, and also to not break things down any further than this. The numpy/scipy/matplotlib dev requirements really aren't that burdensome, and so few people are going to be developing sciform I think it's just not worth any added overhead.

@jagerber48
Copy link
Author

Unless there are strong opinions, I don't want to setup pre-commit hooks or tox/act to do any sort of CI locally. I'm fine just letting it run in the PR and doing the local steps manually. In the future I may set up pre-commit hooks to run tests and do formatting as I become more comfortable with the tools. The main downside I see to setting up the pre-commit hooks is that it seems like it would be challenging to submit a commit that doesn't perfectly conform to everything. I could see this being annoying if I'm trying to set up some quick and dirty code that might fail some tests, or that I don't want to bother to format perfectly and I want to commit it to keep it around. Maybe there's an easy way to bypass pre-commit hooks for these sorts of situations?

@jagerber48
Copy link
Author

I'm going to rename top_dig_place to left_pad_dec_place. This is consistent with a renaming of the rounding mode that rounds by "decimal place" (similar to "precision" for the python built-in FSML) to dec_place. "Digits place" is not a standard terminology for this concept, "decimal place" is the standard terminology.

Here's an idea on SciNum and SciNumUnc. I think SciNum is an ok name for a number that is going to be converted to a formatted string. I don't like SciNumUnc though. I think I could remove the SciNumUnc object and just let the SciNum __init__ method take an optional uncertainty input parameter and then perform the formatting accordingly.

@isabelizimm
Copy link
Contributor

re: development docs: I reviewed that PR, a few small comments but looks good to me! It does really help users to get started contributing when they don't have to guess about some of the going-though-the-motions pieces of development, so I think this is a great addition. Plus, everything ran on my laptop right out of the box, which is always exciting➕ 😄

re: CI/pre-commit: I personally don't use act/tox/etc. I've tried a few, but no project of mine has been heavyweight enough that I found it a net-win to my workflow. The one thing I do use, though, is pre-commit hooks. It might be nice to start using this tool locally to make sure all new code is compliant, and integrate it into CI as you see fit.

A few pre-commit moves that might make it more useful to you:

  • You can pass pre-commit --no-verify to bypass these hooks, if needed
  • You can set up CI for pre-commit, or just have it part of a local workflow (still super useful, imo)
  • If you check in the pre-commit hooks into a repository, it can be part of the development install/docs, just like how you have ruff currently. Because pre-commit runs, well, pre-commit, people won't be able to make non-compliant commits.

re: naming: These both seem like good moves! The digits place -> decimal place move immediately clicked in my brain, and having uncertainty as a parameter seems reasonable enough to me.

@jagerber48
Copy link
Author

@isabelizimm ok yes, pre-commit --no-verify was the trick to make me comfortable setting up pre-commit and adding pre-commit instructions. Fortunately git kraken (the tool I use for git) has an easy button to do the --no-verify option, so this is all very convenient for me.

Ok I'm working on another release (0.31.0) that addresses all of the remaining concerns in the review except the lazy formatting FormattedValue suggestion. Here's the PR for the release jagerber48/sciform#96. There are a number of notable changes you can find in the changelog, but the highlights are:

  • Remove SciNumUnc class, now SciNum class accepts optional second input for value/uncertainty formatted
  • No longer possible to configure separators using FSML, this just made the FSML too complicated. This made it easier to write annotated FSML examples.
  • Many options were renamed. Notably the unintuitive and ugly unc abbreviation has been eliminated. See changelog for all the changes.
  • Linted/reformatted code using ruff. This makes the diff huge

Anyways, if any of you are inclined, @Batalex @machow @isabelizimm I would appreciate if any of you have comments on this jump from 0.30.1 to 0.31.0 before I release it. Otherwise, I'm curious about what next steps on the review look like.

There are a couple improvements I see that can be made, but I consider them to be typical package improvements that should happen over time and I don't know that they have bearing on the review.

@Batalex
Copy link
Contributor

Batalex commented Jan 2, 2024

IMO we are at the end of the review process. Thank you kindly for bearing with our demands!

Before I ask @machow and @isabelizimm for their final approval, I would like to ask you if you feel that the API is stable enough so that you can release 1.0 in the same time frame as the pyopensci "seal of quality"?

This is not an obligation by any mean, of course.

@jagerber48
Copy link
Author

@Batalex great! Thank you!

I've been asking myself the 1.0.0 question lately as well. It's definitely still my goal to get 1.0.0 released. This pyopensci review has been the main thing I've been trying to check off on my release 1.0.0 list, but now that this is wrapping up I need to step back and think about if there is anything else.

  • There are definitely a couple minor changes to behavior I have planned, but I'm not sure if these need to block 1.0.0, I need to list these out.
  • The deferred formatting for html and latex may result in removing the latex option. That makes me tempted to remove the latex option from the user-facing interface for now so that there's not a regression when that feature is deprecated in favor of the deferred formatting.
  • I need to get myself a little bit more convinced that the names of everything are good. I'm feeling better about it after this last round of renamings.

@jagerber48
Copy link
Author

I just released version 0.31.1 which implements all of the changes discussed above.

My current action is compiling my sciform to do list from a few different sources, then I'll know what steps are still needed for 1.0.0.

@jagerber48
Copy link
Author

Ok, I got my thoughts collected. Previously I had to do lists in various places. Now most of the todos are captured as tagged (labeled) issues. Most of the issues get the Status: To Investigate label which means I need to think about whether I want to do anything about that issue or not.

However, two of the issues got tagged for the Release 1.0.0 milestone. They both correspond to what I would consider to be either strange or broken behavior for a couple of options. I'll just link the issues here. I would appreciate any thoughts on the issues that anyone here has.

The latter one especially might result in, e.g. removing the global settings functions like global_add_c_prefix.

The resolution to this issue

May also constitute a small breaking change to the behavior.

Regarding naming: I'm pretty happy with where things are at at the moment. There's maybe a couple names I don't love (e.g. extra_parts_per_forms) but I think the really ugly ones have been cleaned up.

I do have some internal debates about what 1.0.0 should mean for sciform. I can imagine wanting to make small breaking changes, such as jagerber48/sciform#99 after 1.0.0 is released. Do I really need to bump from 1.0.0 to 2.0.0 because the formatting output is slightly changed for some very specific combination of inputs? Or even if I remove a function like global_add_c_prefix that no one is likely using? And even if they are (1) it's probably not super critical since it's just formatting results for humans to read and (2) the fix will be very easy. If I take the conservative approach suggested by answers to Is a small breaking change enough to do a Major Version Upgrade while using SemVer? then the answer is yes, I should major version bump. But if this is the approach I take then I'll want to be even more conservative before releasing 1.0.0 and it will take even longer to get it out!

My feeling is that given (1) sciform is new and small and I'm not sure if anyone is even really using it yet, (2) the nature of sciform as a being an endpoint for making text human readable and (3) the scope of the planned changes, I think it would be reasonable to take a more liberal interpretation of SemVer major version bumps to be for changes that majorly breaking... but I would be open to pushback here.

@Batalex @isabelizimm @machow curious for your feedback on

  • The issues linked above (preferably in the issues themselves
  • What philosophy I should take towards releasing sciform at 1.0.0 (maybe I should just resolve the three issues linked above which are the only open breaking issues before 1.0.0)
  • Any other thoughts on the state of sciform re this PyOpenSci review.

@jagerber48
Copy link
Author

jagerber48 commented Jan 10, 2024

Addressed the 3 issues linked in the previous post to release 0.32.0.

That's all the concrete stuff I had on my list before releasing 1.0.0. Open to suggestions on when I should actually bump to 1.0.0. I'm curious to sit with it a little longer myself and see how I feel about it. I may also try to advertise to others and try to get a little more feedback from more people who actually download and try to use the code for some real application. My belief/hope is that the core functionality should actually be pretty solid and has been for a while, and the issues I've been thinking about lately are pretty fringe edge cases that I only know about because I'm so deep in the code.

I capture my thoughts on releasing 1.0.0 in jagerber48/sciform#117

@machow
Copy link

machow commented Jan 11, 2024

What philosophy I should take towards releasing sciform at 1.0.0. ... I may also try to advertise to others and try to get a little more feedback from more people who actually download and try to use the code for some real application.

Regardless of when you bump to v1, I'd definitely +1 advertising it broadly to folks! I'd think about it like this: if you bump to v1 without people using it, you have to guess what the right behaviors / naming is. If you can get people using it, even with the caveat it's v0, you'll have good feedback on potential changes for a v1. Because v0 explicitly signals you might make changes, in a way you're in a good position to advertise it AND think about important tweaks!

@jagerber48
Copy link
Author

Regardless of when you bump to v1, I'd definitely +1 advertising it broadly to folks! I'd think about it like this: if you bump to v1 without people using it, you have to guess what the right behaviors / naming is. If you can get people using it, even with the caveat it's v0, you'll have good feedback on potential changes for a v1. Because v0 explicitly signals you might make changes, in a way you're in a good position to advertise it AND think about important tweaks!

Yes, this is exactly what I'm thinking! I don't want to have to guess at right behaviors / naming which is what it feels like I would be doing if I bumped to v1.

I don't think v0 is at all an impediment to getting people to use it. That is, I think the fractional intersection of people who might be interested in sciform and people who wouldn't use sciform because it is <v1 is very small. I think the bigger impediments are getting people to be aware of sciform (advertising) and whether sciform is actually broadly useful to people once they see it.

So if that's really true maybe I should just cool my horses on v1 and instead commit to something like (and indicate this in the docs) "I'm going to use and advertise sciform for N(=12?) months during which time I'll collect feedback. At the end of that time I'll revisit the v1 question".

@jagerber48
Copy link
Author

jagerber48 commented Jan 12, 2024

Before going into my thoughts/questions about advertising/getting feedback: What is the status of the pyopensci review at this point? I greatly appreciate the continued feedback I'm getting but also don't want to veer the conversation out of scope or off topic. @Batalex? My rough thinking on 1.0.0 is that I don't want to do that until I get feedback. I imagine it will at least take weeks if not months to get some valuable feedback. The PyOpenSci review is a major milestone towards 1.0.0, but I don't think it gets me all the way there, so I think 1.0.0 should be decoupled from the review at this point.


Suggestions for advertising sciform to get users and get feedback? Here's my plan so far

  • I've added a link to a google forms survey where I can gather feedback on the readme which shows up on the main page of the documentation: https://sciform.readthedocs.io/en/stable/
  • I'm going through relevant stack overflow questions that sciform pertains to (mainly stuff like "how to format numbers using engineering notation in python" or "how to round numbers based on sig figs in python") and answering them, providing links to sciform.
  • After this review is accepted I may request the pyopensci announcement to include a note requesting feedback if anyone wants to check it out and give feedback, or I can add that note on myself in slack.
  • I was working on something like sciform a long time ago, and then this thread came up. The upshot was that there should be a PyPi package to to do the type of formatting requested. This gave me the motivation to beef what I was working on into the full package sciform. I've already linked sciform in that thread. Considering bumping that thread with the update that sciform does a lot of what was requested and is more mature now. HOWEVER I don't know if this would be an appropriate advertisement or if it would be seen as bumping a thread for too much self promotion.
    • A slight side note: There is still a part of me that is interested in getting some of the sciform features into the built in format specification mini language. Notably I think using ! to indicate reproducible sig fig rounding and using an r format type to indicate engineering notation would be highly impactful for scientists and I think could be compatible extensions to the built in FSML. (Noting that the sciform FSML is not strictly an extension of the python FSML). Were I to bump that thread I would probably also include a statement of my ideas in this regard as well.
  • I am not active on r/python but it seems that that could be a natural place to post "Hey here is my open source package, what do people think?"
  • I will likely share with my former research group and maybe some other precision metrology groups I know who may be interested in this package.

How do these ideas look? Where else can/should open source packages be announced?


Further thoughts on the python discourse post: I think it would be inappropriate to revive that thread. I just looked through it a little bit. It was originally about adding SI/IEC prefixes into the built in FSML and then it kind of got hijacked (partly by me) into talking about engineering notation and sig figs. I think the thread does clearly discover where the python FSML falls short with respect to sig figs. The thread is originally about adding prefixes, and the OPs package prefixed already does that, so I don't think sharing sciform (which does prefixes and more) makes sense as a new bump to that thread. If anything I could make a new thread proposing the sig fig + engineering notation ideas and mention sciform in that thread as an existing PyPi package (among some others) that currently provides the desired functionality.

@jagerber48
Copy link
Author

jagerber48 commented Jan 12, 2024

Another breaking variable rename recently occurred to me while I was answering a stack overflow question

fill_char -> left_pad_char
fill_char -> left_pad_fill_char

jagerber48/sciform#126

This would be much more consistent with its option partners left_pad_dec_place and left_pad_matching.

@Batalex
Copy link
Contributor

Batalex commented Jan 13, 2024

Hey @jagerber48,
If you are set on us finishing this review before releasing 1.0.0 so that it comes after you gather feedback, that is fine by me.

@machow, @isabelizimm I need you to check

Final approval (post-review)

- [  ] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

in your reviews above.

As for how to advertise sciform and how to gather feedback:

  • pyopensci's announcement is a great place to start
  • r/python as well
  • you could try to reach the folks @ realpython to include sciform in their newsletter "Pycoders"
  • there might be specialized discourses for engineering with Python?

@isabelizimm
Copy link
Contributor

isabelizimm commented Jan 16, 2024

Hey there-- I have given a ✅ on my review. The plans and notes above for advertising seem like a solid start; if you're involved in other Python discord/slack/etcs, there's often a place to show off work. ++ on v1 being a personal choice, but that it is never too early to solicit feedback!

This package is in a solid state, and @jagerber48 should be really excited and proud of building it up!

@jagerber48
Copy link
Author

If you are set on us finishing this review before releasing 1.0.0 so that it comes after you gather feedback, that is fine by me.

To be clear, yes, I don't want to wait for 1.0.0 before the review is finished.

I still have open questions about various idea/features etc. But at this point I think they're out of scope for this review and should be treated as typical issues for the sciform package. I may move on to asking them in pyopensci discourse or slack (and elsewhere on the internet) if that makes sense. Does that sound right?

@Batalex
Copy link
Contributor

Batalex commented Jan 24, 2024

Ok, let's do this!

@machow, Could you please check the final approval in your review so that we know you are finished with this submission? Thanks!

@machow
Copy link

machow commented Jan 24, 2024

Done! Sorry for the wait!

@jagerber48
Copy link
Author

jagerber48 commented Jan 28, 2024

@Batalex has let me know the review will be wrapping up and asked if there's anything else I'd like to add to the codebase first. A this point there's a continuous stream of small to medium sized improvements I'm making, but I don't think any of them need to tie up or be tied up in the review.

There is one thing I'd mention though. @Batalex suggested a while ago that sciform return a class which defines _repr_latex_ and _repr_html_ functions so that tools like IPython or maybe quarto could hook into these functions to allow richer display than pure unicode in contexts where that is possible. Essentially a beefed up version of the old latex option. After some thought I was able to get this implemented. This involved removing the latex option and introducing the FormattedNumber class which is returned by the Formatter. I wanted to bring this to the attention of @isabelizimm and @machow in case they find it interesting.

See

The feature is merged into main now but has yet to be released. I'll probably release it this week into version 0.33.0.

@jagerber48
Copy link
Author

The previously mentioned FormattedNumber updates and others were just released in 0.33.0. https://github.com/jagerber48/sciform/releases/tag/0.33.0

@Batalex
Copy link
Contributor

Batalex commented Feb 7, 2024


🎉 sciform has been approved by pyOpenSci! Thank you, @jagerber48, for submitting sciform and many thanks to @isabelizimm and @machow for reviewing this package! 😸

We took longer than I expected at first glance, but I think it is for the best. The package was already in a good shape, but it is now definitely more welcoming to new contributors, hence more future-proof.

Author Wrap Up Tasks

There are a few things left to do to wrap up this submission:

  • Activate Zenodo watching the repo if you haven't already done so.
  • Tag and create a release to create a Zenodo version and DOI.
  • Add the badge for pyOpenSci peer-review to the README.md of . The badge should be [![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-review/issues/121).
  • Please fill out the post-review survey. All maintainers and reviewers should fill this out.

Editor Final Checks

Please complete the final steps to wrap up this review. Editor, please do the following:

  • Make sure that the maintainers filled out the post-review survey
    @jagerber48: Please let me know once you are done.
  • Invite the maintainers to submit a blog post highlighting their package. Feel free to use / adapt language found in this comment to help guide the author.
  • Change the status tag of the issue to 6/pyOS-approved6 🚀🚀🚀.
  • Invite the package maintainer(s) and both reviewers to slack if they wish to join.
  • If the author submits to JOSS, please continue to update the labels for JOSS on this issue until the author is accepted (do not remove the 6/pyOS-approved label). Once accepted add the label 9/joss-approved to the issue. Skip this check if the package is not submitted to JOSS.
  • If the package is JOSS-accepted please add the JOSS doi to the YAML at the top of the issue.

If you have any feedback for us about the review process please feel free to share it here. We are always looking to improve our process and documentation in the peer-review-guide.

@jagerber48
Copy link
Author

@Batalex Ok, I completed all those items. Setup Zenodo and was able to do a release to get a DOI (added a badge to readme). Also added PyOpenSci badge (very excited about that!). Completed the maintainer survey.

@Batalex @machow @isabelizimm Thank you all so much for your help, patience, and time spent on this review. This has been a very valuable learning experience for me and I think sciform is very much improved thanks to all of your feedback. My goal is for sciform to be something that can help people and I feel like going through this review process really helped me towards that goal!

@jagerber48
Copy link
Author

@Batalex how do I add sciform and myself to the PyOpenSci website?

@Batalex
Copy link
Contributor

Batalex commented Feb 17, 2024

It should be good: https://github.com/pyOpenSci/pyopensci.github.io/blob/main/_data/packages.yml#L110-L150 but the CI job failed: https://github.com/pyOpenSci/pyopensci.github.io/actions/runs/7909228977/job/21589918535.

@lwasser I am not familiar with Jekyll, is this a one-off error, or is there something more to it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: pyos-accepted
Development

No branches or pull requests

5 participants