Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for the rest of the char types #192

Closed
wants to merge 9 commits into from

Conversation

Pharago
Copy link
Contributor

@Pharago Pharago commented Apr 2, 2023

No description provided.

Added support for the other char types
fastfloat_really_inline FASTFLOAT_CONSTEXPR20
uint32_t parse_eight_digits_unrolled(const char *chars) noexcept {
uint32_t parse_eight_digits_unrolled(TCH const * chars) noexcept {
Copy link
Member

@lemire lemire Apr 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This routine expects ASCII/UTF-8. It will fail on UTF-8 inputs, for example. (Update: I was wrong.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read_u64 handles the conversion of integer size between char types

Copy link
Member

@lemire lemire Apr 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stand corrected. However, it would still be be needed to check that it makes sense to do so: it is a very cheap check given ASCII inputs, but if you need to do more work, it may end up being a net negative. So we would need to benchmark with and without this optimization to make sure it is warranted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, my reasoning for all of this is that I needed an unicode version, so I don't have to double buffer with fast or wcstod, now I'm going to add hex support using radix bases, and finally an ignore list, as for the tests I'll update them as I go, I guess I'll add a new enum 'infer' to chars_format to work as a pseudo drop-in replacement for wcstod behaviour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Great work. What you are doing is perfectly reasonable.

Ignore my initial negative and misguided comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, it's such a delicate change, and everywhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stand corrected. However, it would still be be needed to check that it makes sense to do so: it is a very cheap check given ASCII inputs, but if you need to do more work, it may end up being a net negative. So we would need to benchmark with and without this optimization to make sure it is warranted.

I have a version of fast_float in which I've attempted to optimize for other char_types using opt-in SIMD. It also adds support for different parsing rules (JSON parsing rules, opt-in inf/nan etc.) --> if this is okay I'm happy to submit a PR or modify this one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mayawarrier Please do so.

@lemire
Copy link
Member

lemire commented Apr 2, 2023

Thanks. We are going to need tests for all these types before we can consider this.

@lemire
Copy link
Member

lemire commented Apr 2, 2023

Note that it looks highly valuable, so if we can manage to have decent testing, then we should definitively merge this.

@lemire
Copy link
Member

lemire commented Apr 6, 2023

@Pharago There is no harm in adding your name to the contributor list as part of this PR.

Removed storage class declarations from explicit template specializations of string constants
@lemire
Copy link
Member

lemire commented Apr 8, 2023

Running tests.

@lemire
Copy link
Member

lemire commented Apr 10, 2023

The Ubuntu 18.04 failure is probably because Ubuntu 18.04 is no longer supported in CI. I need to remove it.

@lemire
Copy link
Member

lemire commented Apr 10, 2023

@Pharago We are going to need tests for this. We don't need to reproduce all tests (it would take a long time) but at least some of the tests should be extended to include different char types.

@Pharago Pharago closed this Apr 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants