Enhancement: support hex floats #14

Closed
ghost opened this issue May 7, 2020 · 14 comments

Comments

ghost commented May 7, 2020

Hexadecimal floats are a notation for floating-point values that C has supported since C99. The mantissa is written in hexadecimal and the exponent is a power of two. This is useful because it shows the exact value, with no rounding or decimal approximation.

E.g. 0x1.eff2bp+6 is exactly 2031403/16384, or about 123.987.
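
For a quick illustration of the notation in standard C (C99 and later): the `%a` conversion of `printf` writes a double as a hex float, and `strtod` reads one back. This is only a sketch. The helper names `hex_float_round_trips` and `parse_hex_float` are made up for it, and because the exact digits `%a` emits can differ between C libraries, it checks the round trip rather than the text.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format x with %a, parse the text back with
 * strtod, and report whether the value survived unchanged (1) or not (0). */
int hex_float_round_trips(double x)
{
    char buf[64];
    int n = snprintf(buf, sizeof buf, "%a", x);

    assert(n > 0 && (size_t)n < sizeof buf);  /* output fit in buf */
    return strtod(buf, NULL) == x;
}

/* Hypothetical helper: parse a hex float such as "0x1.eff2bp+6". */
double parse_hex_float(const char *s)
{
    assert(s != NULL);
    return strtod(s, NULL);
}
```

With these, `parse_hex_float("0x1.eff2bp+6")` yields exactly 2031403/16384, and a value such as 0.1, which has no short exact decimal form, still round-trips through its hex text.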

ghost changed the title from "Feature requests: support hex floats" to "Feature request: support hex floats" on May 7, 2020
Owner

lcn2 commented Feb 3, 2021

I like this idea. Anyone want to take a crack at modifying the parser to permit such hexadecimal floats?

kcrossen commented Mar 1, 2022

`
// char* hex_float_c_str;
// PCRE validator of input:
// "[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?"
// Test examples
// char* example_hex_float = "+0x1.921fb54442d18p+0001";
// char* example_hex_float = "+0x0.0000000000000p+0000";
// char* example_hex_float = "+0x0.0000000000001p+0000";
// char* example_hex_float = "+0x1.0p+0000";
// char* example_hex_float = "+0x1.0";
// char* example_hex_float = "+0x1";

    // Keeping sign separate makes mantissa testing simpler
    int sign = 1;
    long long numerator = 0;
    int mantissa_power_of_2 = 0;

    // Keeping sign separate makes power testing simpler
    int power_sign = 1;
    int power_of_2 = 0;

    // Introduce some lenience
    bool has_fraction = false;
    bool has_power = false;

    // Keeping it simpler later in code
    for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
        hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);

    int ch_idx = 0;
    if ((hex_float_c_str[ch_idx] == '+') or
        (hex_float_c_str[ch_idx] == '-')) {
        if (hex_float_c_str[ch_idx] == '-') sign = -1;
        ch_idx++;
    }

    // hex_float_c_str is a char*, so use C string functions (needs <string.h>)
    if (strncmp(hex_float_c_str + ch_idx, "0x", 2) == 0) {
        ch_idx += 2;
        while (ch_idx < (int) strlen(hex_float_c_str)) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < (int) strlen(hex_float_c_str)) and
        (hex_float_c_str[ch_idx] == '.')) {
        ch_idx++;
        // Require fractional part after radix point
        has_fraction = true;

        if (numerator == 0) {
            // Literal must have started with "0x0." ...
            // ... i.e. not normalized, therefore ...
            mantissa_power_of_2 -= 1;
        }

        while (ch_idx < (int) strlen(hex_float_c_str)) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                mantissa_power_of_2 -= 4;
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                mantissa_power_of_2 -= 4;
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < (int) strlen(hex_float_c_str)) and
        (hex_float_c_str[ch_idx] == 'p')) {
        ch_idx++;
        // Not lenient here, must finish power if started
        has_power = true;

        if ((hex_float_c_str[ch_idx] == '+') or
            (hex_float_c_str[ch_idx] == '-')) {
            if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
            ch_idx++;
        }

        while (ch_idx < (int) strlen(hex_float_c_str)) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else break;
        }
    }

    // Assemble numerator & denominator
    long long denominator = 1;

    // Reduction is easy here since the only ...
    // ... prime factor of denominator is two
    if (has_fraction) {
        // Otherwise denominator must already be one
        while ((numerator > 1) and
               ((numerator & 1) == 0) and
               // OK, maybe a little lenience
               (mantissa_power_of_2 != 0)) {
            numerator = numerator >> 1;
            mantissa_power_of_2 += 1;
        }
     }

    if ((numerator > 0) and
        (has_fraction or has_power)) {
        // Only way denominator can be other than one
        power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);

        long long power_multiplier = 1;
        power_multiplier = power_multiplier << abs(power_of_2);
        if (power_of_2 > 0) {
            numerator = numerator * power_multiplier;
        }
        else if (power_of_2 < 0) {
            denominator = power_multiplier;
        }
    }

    numerator = sign * numerator;

`

kcrossen commented Mar 1, 2022

Testing usefulness in Calc (my version):

    QString test_commands = "";
    test_commands += QString("test_value=") + QString::number(numerator) + "/" + QString::number(denominator) + ";";
    RPN_Commands_Execute(test_commands);

    QString test_result = Trim_Calc_Result(Calc_Evaluate("round(test_value, 32);"));

    qDebug() << test_result;

kcrossen commented Mar 1, 2022

The above code will "overflow" or "underflow" because of the limitations of long long, so:
`
// char* hex_float_c_str;
// PCRE validator of input:
// "[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?"
// Test examples
// char* hex_float_c_str = "+0x1.921fb54442d18p+0001";
// char* hex_float_c_str = "+0x0.0000000000000p+0000";
// char* hex_float_c_str = "+0x0.0000000000001p+0000";
// char* hex_float_c_str = "+0x1.0p+0000";
// char* hex_float_c_str = "+0x1.0";
// char* hex_float_c_str = "+0x1";
// Max:
// char* hex_float_c_str = "+0x1.fffffffffffffp+1023";
// Min:
// char* hex_float_c_str = "+0x1.0000000000000p-1074";

    int hex_float_c_str_length = strlen(hex_float_c_str);
    // Keeping sign separate makes mantissa testing simpler
    int sign = 1;
    long long numerator = 0;
    int mantissa_power_of_2 = 0;
    int mantissa_digit_count = 0;

    // Keeping sign separate makes power testing simpler
    int power_sign = 1;
    int power_of_2 = 0;

    // Introduce some lenience
    bool has_fraction = false;
    bool has_power = false;

    for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
        hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);

    int ch_idx = 0;
    if ((hex_float_c_str[ch_idx] == '+') or
        (hex_float_c_str[ch_idx] == '-')) {
        if (hex_float_c_str[ch_idx] == '-') sign = -1;
        ch_idx++;
    }

    if ((ch_idx < (hex_float_c_str_length - 1)) and
        (hex_float_c_str[ch_idx] == '0') and
        (hex_float_c_str[ch_idx + 1] == 'x')) {
        ch_idx += 2;
        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == '.')) {
        ch_idx++;
        // Require fractional part after radix point
        has_fraction = true;

        if (numerator == 0) {
            // Literal must have started with "0x0." ...
            // ... i.e. not normalized, therefore ...
            mantissa_power_of_2 -= 1;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == 'p')) {
        ch_idx++;
        // Not lenient here, must finish power if started
        has_power = true;

        if ((hex_float_c_str[ch_idx] == '+') or
            (hex_float_c_str[ch_idx] == '-')) {
            if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
            ch_idx++;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else break;
        }
    }

    // Assemble numerator & denominator
    long long denominator = 1;

    // Reduction is easy here since the only ...
    // ... prime factor of denominator is two
    if (has_fraction) {
        // Otherwise denominator must already be one
        while ((numerator > 1) and
               ((numerator & 1) == 0) and
               // OK, maybe a little lenience
               (mantissa_power_of_2 != 0)) {
            numerator = numerator >> 1;
            mantissa_power_of_2 += 1;
        }
     }

    if ((numerator > 0) and
        (has_fraction or has_power)) {
        // Only way denominator can be other than one
        power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);
    }

    QString test_commands = "test_value=(";
    if (sign < 0) test_commands += "-1*";
    test_commands += QString::number(numerator);
    if (power_of_2 > 0) test_commands += "*2^" + QString::number(power_of_2);
    test_commands += ")/(";
    test_commands += QString::number(denominator);
    if (power_of_2 < 0) test_commands += "*2^" + QString::number(-power_of_2);
    test_commands += ");";
    RPN_Commands_Execute(test_commands);

    QString test_result = Trim_Calc_Result(Calc_Evaluate("estr(test_value);"));

    qDebug() << test_result;

`
The individual components, mantissa and power, will stay within the range of long long if the input follows the standard. Excess mantissa digits are ignored if they come after the radix point, or effectively replaced with zeros if they come before it.

kcrossen commented Mar 1, 2022

Added test for "overflow":
// char* hex_float_c_str = "+0x1.921fb54442d18abcdefp+0001";

kcrossen commented Mar 3, 2022

Expand value range of tolerated hex floats by 16X:
`void
Test_Hexadecimal_Float_Parse ( QString Hexadecimal_Float_String ) {
// PCRE validator of input:
QRegExp validate_hexadecimal_float = QRegExp("[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?", Qt::CaseInsensitive);

if (validate_hexadecimal_float.exactMatch(Hexadecimal_Float_String)) {
    QByteArray example_hex_float_ba = Hexadecimal_Float_String.toLocal8Bit();
    char* hex_float_c_str = (char*) malloc(example_hex_float_ba.count() + 10);
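    // Copy count() + 1 bytes so the terminating '\0' from data() is included;
    // otherwise strlen() below would read uninitialized memory
    strncpy(hex_float_c_str, example_hex_float_ba.data(), example_hex_float_ba.count() + 1);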

    int hex_float_c_str_length = strlen(hex_float_c_str);
    // Keeping sign separate makes mantissa testing simpler
    int sign = 1;
    unsigned long long numerator = 0;
    int mantissa_power_of_2 = 0;

    #define maximum_mantissa_digit_count 15
    int mantissa_digit_count = 0;

    // Keeping sign separate makes power testing simpler
    int power_sign = 1;
    int power_of_2 = 0;

    // Introduce some lenience
    bool has_fraction = false;
    bool has_power = false;

    for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
        hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);

    int ch_idx = 0;
    if ((hex_float_c_str[ch_idx] == '+') or
        (hex_float_c_str[ch_idx] == '-')) {
        if (hex_float_c_str[ch_idx] == '-') sign = -1;
        ch_idx++;
    }

    if ((ch_idx < (hex_float_c_str_length - 1)) and
        (hex_float_c_str[ch_idx] == '0') and
        (hex_float_c_str[ch_idx + 1] == 'x')) {
        ch_idx += 2;
        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == '.')) {
        ch_idx++;
        // Require fractional part after radix point
        has_fraction = true;

        if (numerator == 0) {
            // Literal must have started with "0x0." ...
            // ... i.e. not normalized, therefore ...
            mantissa_power_of_2 -= 1;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == 'p')) {
        ch_idx++;
        // Not lenient here, must finish power if started
        has_power = true;

        if ((hex_float_c_str[ch_idx] == '+') or
            (hex_float_c_str[ch_idx] == '-')) {
            if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
            ch_idx++;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else break;
        }
    }

    // Assemble numerator & denominator
    unsigned long long denominator = 1;

    // Reduction is easy here since the only ...
    // ... prime factor of denominator is two
    if (has_fraction) {
        // Otherwise denominator must already be one
        while ((numerator > 1) and
               ((numerator & 1) == 0) and
               // OK, maybe a little lenience
               (mantissa_power_of_2 != 0)) {
            numerator = numerator >> 1;
            mantissa_power_of_2 += 1;
        }
     }

    if ((numerator > 0) and
        (has_fraction or has_power)) {
        // Only way denominator can be other than one
        power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);
    }

    QString test_commands = "test_value=(";
    if (sign < 0) test_commands += "-1*";
    test_commands += QString::number(numerator);
    if (power_of_2 > 0) test_commands += "*2^" + QString::number(power_of_2);
    test_commands += ")/(";
    test_commands += QString::number(denominator);
    if (power_of_2 < 0) test_commands += "*2^" + QString::number(-power_of_2);
    test_commands += ");";

    // RPN_Commands_Execute executes the argument command string for its side effects ...
    // ... i.e. Calc's internal state (variables).
    // It doesn't care about the results unless there is an error.
    RPN_Commands_Execute(test_commands);

    // Calc_Evaluate executes the argument command string and returns the result
    // Trim_Calc_Result strips the syntactic "sugar" from the returned result.
    QString test_result_internal = Trim_Calc_Result(Calc_Evaluate("estr(test_value);"));
    QString test_result = Trim_Calc_Result(Calc_Evaluate("round(test_value, 32);"));

    qDebug() << "/*--------------------*/";
    qDebug() << Hexadecimal_Float_String;
    qDebug() << test_result;
    qDebug() << test_result_internal;
    qDebug() << "/*--------------------*/";

    free(hex_float_c_str);
}
else {
    qDebug() << "Validation Error: " + Hexadecimal_Float_String;
}

}`

Test code:
    Test_Hexadecimal_Float_Parse("+0x1.921fb54442d18p+0001");
    Test_Hexadecimal_Float_Parse("+0x0.0000000000000p+0000");
    Test_Hexadecimal_Float_Parse("+0x0.0000000000001p+0000");
    Test_Hexadecimal_Float_Parse("+0x1.0p+0000");
    Test_Hexadecimal_Float_Parse("+0x1.0");
    Test_Hexadecimal_Float_Parse("+0x1");
    // Defined maximum allowable value:
    Test_Hexadecimal_Float_Parse("+0x1.fffffffffffffp+1023");
    // Defined minimum allowable value:
    Test_Hexadecimal_Float_Parse("+0x1.0000000000000p-1074");
    // Test too many hex digits in mantissa:
    Test_Hexadecimal_Float_Parse("+0x1.921fb54442d18abcdefp+0001");

Test results:
/*--------------------*/
"+0x1.921fb54442d18p+0001"
"3.14159265358979311599796346854419"
"884279719003555/281474976710656"
/*--------------------*/
/*--------------------*/
"+0x0.0000000000000p+0000"
"0"
"0"
/*--------------------*/
/*--------------------*/
"+0x0.0000000000001p+0000"
"0.00000000000000011102230246251565"
"1/9007199254740992"
/*--------------------*/
/*--------------------*/
"+0x1.0p+0000"
"1"
"1"
/*--------------------*/
/*--------------------*/
"+0x1.0"
"1"
"1"
/*--------------------*/
/*--------------------*/
"+0x1"
"1"
"1"
/*--------------------*/
/*--------------------*/
"+0x1.fffffffffffffp+1023"
"179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368"
"179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368"
/*--------------------*/
/*--------------------*/
"+0x1.0000000000000p-1074"
"0"
"1/202402253307310618352495346718917307049556649764142118356901358027430339567995346891960383701437124495187077864316811911389808737385793476867013399940738509921517424276566361364466907742093216341239767678472745068562007483424692698618103355649159556340810056512358769552333414615230502532186327508646006263307707741093494784"
/*--------------------*/
/*--------------------*/
"+0x1.921fb54442d18abcdefp+0001"
"3.14159265358979311599796346854419"
"884279719003555/281474976710656"
/*--------------------*/
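
Results like these can be cross-checked against the C library, since `strtod` has accepted hex floats since C99. The sketch below compares a reduced mantissa and power of two with what `strtod` returns for the same text; `matches_strtod` and `scale_by_power_of_two` are hypothetical names, and the comparison assumes the mantissa is below 2^53 so that converting it to double is exact. By the C definition of the notation, `0x0.0000000000001p+0` is 16^-13 = 2^-52, so this check would flag the `1/9007199254740992` (2^-53) result above as a mismatch. The extra `mantissa_power_of_2 -= 1` applied when the integer part is zero appears to be the cause.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Multiply x by 2^exp2 one step at a time. Each step is exact as long
 * as the result neither overflows nor becomes subnormal; ldexp() from
 * <math.h> does the same job in one call. */
static double scale_by_power_of_two(double x, int exp2)
{
    for (; exp2 > 0; exp2--) x *= 2.0;
    for (; exp2 < 0; exp2++) x /= 2.0;
    return x;
}

/* Hypothetical cross-check: does mantissa * 2^exp2 equal the double that
 * strtod() produces for the same hex float text? Returns 1 if so. */
int matches_strtod(const char *literal, uint64_t mantissa, int exp2)
{
    assert(literal != NULL);
    assert(mantissa < (UINT64_C(1) << 53));   /* exact as a double */
    return scale_by_power_of_two((double)mantissa, exp2)
           == strtod(literal, NULL);
}
```

For example, `884279719003555/281474976710656` corresponds to a mantissa of 884279719003555 and a power of -48, which agrees with `strtod` for the first test string.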

I've tried to use something approximating usual C style (excepting the use of array notation).

Looking at Calc's parsing code, fully integrating this at the parsing level looks to be well beyond my competence. And of course, integrated at that level, one could support quadruple-precision hex floats, etc.

Have fun.

pmetzger commented Mar 3, 2022

@kcrossen May I ask why you submitted all this code as comments? There is a pull request facility.

Saldef commented Mar 18, 2022

I think Fabrice Bellard already did this long ago in his NumCalc app, along with a lot of other features. Check it out here:
http://numcalc.com/

kcrossen commented Mar 23, 2022

I don't understand the bulk of (or nearly any of) the parsing code, which makes posting this in the usual form problematic.

@kcrossen

Furthermore, I don't know how to use the relevant GitHub tools (which I use for my own code about the way I used to use SourceForge).

Owner

lcn2 commented Mar 24, 2022

Calc is maintained on GitHub, not SourceForge. GitHub has lots of good documentation that you should consider.

pmetzger commented Apr 5, 2022

@kcrossen GitHub is mostly just git plus a web interface.

Owner

lcn2 commented Apr 8, 2022

We hope to address this, perhaps sometime next month, in a 2.14.1.x non-production release.

lcn2 changed the title from "Feature request: support hex floats" to "Enhancement: support hex floats" on Mar 6, 2023
Owner

lcn2 commented Oct 4, 2023

This issue will be part of calc v3: see issue #103. Closing this issue so that any further discussion may occur under issue #103.

@lcn2 lcn2 closed this as completed Oct 4, 2023
4 participants