Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Add is7BitASCII(s, allowed) to LibString #963

Merged
merged 8 commits into from
Jun 19, 2024
Merged

✨ Add is7BitASCII(s, allowed) to LibString #963

merged 8 commits into from
Jun 19, 2024

Conversation

atarpara
Copy link
Collaborator

Description

closing #962

Checklist

Ensure you completed all of the steps below before submitting your pull request:

  • Ran forge fmt?
  • Ran forge snapshot?
  • Ran forge test?

Pull requests with an incomplete checklist will be thrown out.

@atarpara atarpara changed the title ✨ Add isAlphanumeric and isAlphanumericChar() in LibString.sol ✨ Add isAlphanumeric and isAlphanumericChar in LibString Jun 18, 2024
@atarpara atarpara requested a review from Vectorized June 18, 2024 10:49
Copy link

@PaulRBerg PaulRBerg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nice, thank you

@@ -406,6 +406,36 @@ library LibString {
}
}

/// @dev Returns if this string is a alphanumeric character including space.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, I would rewrite the NatSpec to:

/// @dev Returns true if the string contains only alphanumeric characters and spaces.

}
}

/// @dev Returns if this character is a alphanumeric including space.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, I would rewrite the NatSpec to:

/// @dev Returns true if the character is an alphanumeric character or a space.

@0xCLARITY
Copy link

Left a more detailed comment on the issue: #962 (comment), but I think it would be surprising for spaces to be considered "alphanumeric".

@Vectorized Vectorized changed the title ✨ Add isAlphanumeric and isAlphanumericChar in LibString ✨ Add is7BitASCII(s, allowed) to LibString Jun 18, 2024
@Vectorized
Copy link
Owner

Vectorized commented Jun 18, 2024

Refactored to make it more general.

Will merge in 24 hours if no hard objections.

For compile time evalaution:

function is7BitASCIIAlphaNumericWithSpace(string memory s) internal pure returns (bool result) {
    return LibString.is7BitASCII(
        s, 
        LibString.ALPHANUMERIC_7_BIT_ASCII | uint128(1 << uint256(uint8(bytes1(" "))))
    );
}

Btw, we have a LibString.escapeHTML(s).

@atarpara
Copy link
Collaborator Author

atarpara commented Jun 19, 2024

@Vectorized In the current implementation of the is7BitASCII(string memory s, uint128 allowed) function, The behavior for empty strings is to return true in all cases. This is inconsistent with the behavior observed in similar functions across various programming languages.

For instance:

Python isalnum
Rust's is_ascii_alphanumeric
Govalidator's alphanumeric

In these implementations, the function typically returns false for an empty string. To align with these standards and ensure consistent behavior, I propose the following adjustments:

  1. The function will return false for empty strings, aligning with the default behavior in other languages.
  2. If allowing empty strings is necessary for specific use cases, a separate function can be created. Alternatively, developers can handle this explicitly by checking the length of the string using bytes(str).length == 0.

Everyone is open to give your valuable suggestions.

@Vectorized
Copy link
Owner

Vectorized commented Jun 19, 2024

Let's deviate away from the popular languages.

For this example, it is logically closer to:

Python

all(c in "0123456789" for c in "") # True

Rust

fn is_alphanumeric(s: &str) -> bool {
    s.chars().all(|c| c.is_alphanumeric())
}

Note that Rust's is_alphanumeric operates on characters, not strings.

Rust's philosophy is to provide the most fundamental building blocks which can be composed together via terse syntax.

Due to limited space in Solady, it would be better to lean towards Rust's philosophy.

Also, to keep consistency with LibString.is7BitASCII("") == true. Making this false will feel weird.

Python's isalnum and Govalidator's alphanumericapproach the problem from a HTML input or regex perspective.

isalnum is a super ancient method in Python since the 1.6 days, and the programming paradigm back then was to provide functions that are specialized for HTML input validation instead of for general purpose.

See: https://www.php.net/manual/en/function.ctype-alnum.php

@Vectorized Vectorized merged commit 3d8aa00 into main Jun 19, 2024
9 checks passed
@Vectorized Vectorized deleted the alphanumeric branch June 19, 2024 22:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants