[MWRAPPER-123] hash string char-by-char for ksh implementations #117

Merged: bmarwell merged 1 commit into master from constant_digit_conversion on Feb 5, 2024

Conversation

bmarwell
Contributor

@bmarwell bmarwell commented Feb 1, 2024

AIX's ksh's printf %d implementation cannot handle full string input.

Thus, create the hash char-by-char. This is what the implementation would have done anyway.


Follow this checklist to help us incorporate your
contribution quickly and easily:

  • Make sure there is a JIRA issue filed
    for the change (usually before you start working on it). Trivial changes like typos do not
    require a JIRA issue. Your pull request should address just this issue, without
    pulling in other changes.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Format the pull request title like [MWRAPPER-123] - Fixes bug in ApproximateQuantiles,
    where you replace MWRAPPER-XXX with the appropriate JIRA issue. Best practice
    is to use the JIRA issue title in the pull request title and in the first line of the
    commit message.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Run mvn clean verify to make sure basic checks pass. A more thorough check will
    be performed on your pull request automatically.
  • You have run the integration tests successfully (mvn -Prun-its clean verify).

If your pull request is about 20 lines of code, you don't need to sign an
Individual Contributor License Agreement. If you are unsure,
please ask on the developers list.

To make clear that you license your contribution under
the Apache License Version 2.0, January 2004
you have to acknowledge this by using the following check-box.

@bmarwell bmarwell added the bug Something isn't working label Feb 1, 2024
@bmarwell bmarwell marked this pull request as draft February 1, 2024 07:55
@bmarwell bmarwell changed the title [MWRAPPER-123] replace bogus printf %d with constant 104 for now [MWRAPPER-123] hash string char-by-char for AIX's ksh implementation Feb 1, 2024
@bmarwell bmarwell force-pushed the constant_digit_conversion branch 2 times, most recently from 63eb1da to 8fca98a Compare February 1, 2024 09:58
@bmarwell bmarwell marked this pull request as ready for review February 1, 2024 09:58

@rmannibucau rmannibucau left a comment

OK for me, but this is yet another sign that we don't want a shell script with any logic in it, so I would move to a java/jsh (or any other) script, since the wrapper will no longer be used with older JVM versions.

@bmarwell
Contributor Author

bmarwell commented Feb 1, 2024

> OK for me, but this is yet another sign that we don't want a shell script with any logic in it, so I would move to a java/jsh (or any other) script, since the wrapper will no longer be used with older JVM versions.

100 %

I still don't like the subshell (| cut ). If anyone can come up with a better solution, let us know.

@slawekjaranowski
Member

My test:

hash_string() {
  str="${1:-}" h=0
  while [ -n "$str" ]; do
    c="${str%${str#?}}"
    h=$(( ( h * 31 + $(LC_CTYPE=C printf %d "'$c") ) % 4294967296 ))
    str="${str#?}"
  done
  printf %x\\n $h
}

@bmarwell can you try on AIX?

@bmarwell
Contributor Author

bmarwell commented Feb 2, 2024

> My test:
>
> hash_string() {
>   str="${1:-}" h=0
>   while [ -n "$str" ]; do
>     c="${str%${str#?}}"
>     h=$(( ( h * 31 + $(LC_CTYPE=C printf %d "'$c") ) % 4294967296 ))
>     str="${str#?}"
>   done
>   printf %x\\n $h
> }
>
> @bmarwell can you try on AIX?

sadly...

# vi test.sh
# ksh test.sh "https://reopsitory/repository/maven-central/org/apache/maven/apache-maven/3.9.6/apache-maven-3.9.6-bin.zip"
printf: 2589392689: A return value of a math subroutine is not within machine precision.
7fffffff
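
For context on the output above: 2589392689 is larger than 2147483647 (2^31 - 1, which is 7fffffff in hex). One plausible reading, and it is only an assumption here, is that this printf converts its argument through a signed 32-bit integer and clamps on overflow. A minimal sketch of that range check, using the hypothetical helper name exceeds_int32 and assuming a shell whose test builtin handles 64-bit integers:

```sh
# Hypothetical helper (not part of the PR): succeeds when the value
# is larger than the biggest signed 32-bit integer, 2^31 - 1.
exceeds_int32() {
  [ "$1" -gt 2147483647 ]
}

if exceeds_int32 2589392689; then
  echo "2589392689 does not fit in a signed 32-bit integer"
fi

# The signed 32-bit maximum in hex, the same value as the last line of the log.
printf '%x\n' 2147483647
```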

@slawekjaranowski
Member

just out of curiosity, what result will be:

echo 'T="abcde"; echo "${T#?}"' | bash  

echo 'T="abcde"; echo "${T%${T#?}}"' | bash

@bmarwell
Contributor Author

bmarwell commented Feb 2, 2024

# echo 'T="abcde"; echo "${T#?}"' | bash
bcde

# echo 'T="abcde"; echo "${T%${T#?}}"' | bash
a

@slawekjaranowski
Member

> # echo 'T="abcde"; echo "${T#?}"' | bash
> bcde
>
> # echo 'T="abcde"; echo "${T%${T#?}}"' | bash
> a

I meant a test on AIX with ksh ...

@bmarwell
Contributor Author

bmarwell commented Feb 3, 2024

> > # echo 'T="abcde"; echo "${T#?}"' | bash
> > bcde
> >
> > # echo 'T="abcde"; echo "${T%${T#?}}"' | bash
> > a
>
> I meant a test on AIX with ksh ...

# echo 'T="abcde"; echo "${T#?}"' | ksh
bcde

# echo 'T="abcde"; echo "${T%${T#?}}"' | ksh
a

I think I know what you have in mind...
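
The two expansions tested above split a string into its first character and the rest: ${T#?} strips the shortest leading match of the pattern ? (one character), and ${T%...} then strips that remainder from the end, leaving only the first character. A small sketch, assuming a POSIX-style shell and ASCII input. The variable names first and rest are illustrative, and quoting the inner expansion is an addition here so that the remainder is treated literally rather than as a pattern:

```sh
T="abcde"
rest="${T#?}"           # drop the first character
first="${T%"$rest"}"    # drop the remainder, keeping the first character
printf '%s %s\n' "$first" "$rest"
```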

@bmarwell
Contributor Author

bmarwell commented Feb 3, 2024

I think we can replace this

  # strings start with index 1
  idx=1

  # lt: skip line break at end
  while [ $idx -lt "$length" ]; do
    char=$( printf "%s" "$str" | cut -c $idx )
    h=$(( ( h * 31 + $(LC_CTYPE=C printf %d "'$char") ) % 4294967296 ))
    idx=$(( idx + 1 ))
  done

with this:

  while [ "$str" != "" ]; do
    char="${str%${str#?}}"
    h=$(( ( h * 31 + $(LC_CTYPE=C printf %d "'$char") ) % 4294967296 ))
    str="${str#?}"
  done
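
Put together, the loop above could sit in a function like the following. This is a minimal sketch, not the merged code: the name hash_string_sketch is hypothetical, h is initialised to 0 as in the earlier test function, the inner expansion is quoted as a precaution against pattern characters in the remainder, and the result is printed in decimal with %s so that no numeric printf conversion is applied to the large value. It assumes ASCII input and shell arithmetic wider than 32 bits.

```sh
hash_string_sketch() {
  str="${1:-}"
  h=0
  while [ "$str" != "" ]; do
    # First character: strip the remainder (everything after it) from the end.
    char="${str%"${str#?}"}"
    # h = h * 31 + character code, kept within 32 bits.
    h=$(( ( h * 31 + $(LC_CTYPE=C printf %d "'$char") ) % 4294967296 ))
    # Continue with the remainder.
    str="${str#?}"
  done
  printf '%s\n' "$h"
}

hash_string_sketch "abc"
```

For "abc" the arithmetic works out to (97 * 31 + 98) * 31 + 99 = 96354.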

@bmarwell bmarwell force-pushed the constant_digit_conversion branch 6 times, most recently from daf7292 to 6b8cd96 Compare February 3, 2024 12:18
@bmarwell bmarwell merged commit e16efcb into master Feb 5, 2024
51 checks passed
@bmarwell bmarwell deleted the constant_digit_conversion branch February 5, 2024 08:12
@bmarwell bmarwell changed the title [MWRAPPER-123] hash string char-by-char for AIX's ksh implementation [MWRAPPER-123] hash string char-by-char for ksh implementations Feb 8, 2024