Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[QUESTION] Bypass Math Overflow for Large Computations #66

Closed
STSMHQ opened this issue Mar 16, 2023 · 12 comments
Closed

[QUESTION] Bypass Math Overflow for Large Computations #66

STSMHQ opened this issue Mar 16, 2023 · 12 comments

Comments

@STSMHQ
Copy link

STSMHQ commented Mar 16, 2023

Hi, there,

I love your command line tool and your blog (I read all your posts). You're an amazing developer. I hope one day I can be like you.
I'm opening this issue to ask you if there is any way to bypass the Math error: overflow: number cannot fit error on your calculator. I would like to test it for benchmark purposes against Julia REPL (for example), but your bc gives me this error when the number has too many digits.

Example Calculation: ((169287^137)^920)^13256118217109.

Regards,
STSM

@gavinhoward
Copy link
Owner

Hello,

Thank you for the compliments!

I'm opening this issue to ask you if there is any way to bypass the Math error: overflow: number cannot fit error on your calculator.

It depends.

On the particular example you gave, my machine did not give the error. (I am running an x86_64 Linux machine.)

The error happens in this case because bc limits exponents to the maximum that can fit in the unsigned long type. There are good reasons for this, not least of which calculating a power with an exponent larger than that takes a long time.

For example, I started running the example calculation you gave seven hours ago, and it's still going.

The other reason is that the exponent needs to be converted into an unsigned long to do the actual calculation, which involves running a loop.

By the way, this is also the reason that exponents need to be integers.

Anyway, one way to get around the limit would be to get a computer with a 64-bit unsigned long type. Machines that may not have that could include Windows (even if 64-bit) or 32-bit machines with any OS.

As an example, here are the limits for my machine (found by the limits keyword):

$ bc
>>> limits
BC_LONG_BIT      = 64
BC_BASE_DIGS     = 9
BC_BASE_POW      = 1000000000
BC_OVERFLOW_MAX  = 18446744073709551615

BC_BASE_MAX      = 1000000000
BC_DIM_MAX       = 18446744073709551614
BC_SCALE_MAX     = 18446744073709551614
BC_STRING_MAX    = 18446744073709551614
BC_NAME_MAX      = 18446744073709551614
BC_NUM_MAX       = 18446744073709551614
BC_RAND_MAX      = 18446744073709551615
MAX Exponent     = 18446744073709551615
Number of vars   = 18446744073709551614
>>> quit

There are two to notice: BC_LONG_BIT and MAX Exponent.

BC_LONG_BIT is 64, which means bc was built assuming a 64-bit unsigned long type. MAX Exponent is the actual limit to the exponent, which equals (2^64)-1, which is ULONG_MAX on a machine with a 64-bit unsigned long.

However, if I tell my bc to assume a 32-bit unsigned long when I build it, using:

$ LONG_BIT=32 ./configure <args>
$ make

Then the limits look different:

$ bin/bc
>>> limits
BC_LONG_BIT      = 32
BC_BASE_DIGS     = 4
BC_BASE_POW      = 10000
BC_OVERFLOW_MAX  = 4294967295

BC_BASE_MAX      = 10000
BC_DIM_MAX       = 4294967294
BC_SCALE_MAX     = 4294967294
BC_STRING_MAX    = 4294967294
BC_NAME_MAX      = 4294967294
BC_NUM_MAX       = 4294967294
BC_RAND_MAX      = 4294967295
MAX Exponent     = 4294967295
Number of vars   = 18446744073709551614
>>> quit

BC_LONG_BIT is now 32, and MAX Exponent is now 4294967295, which is (2^32)-1. Notice that 4294967295 is less than the last exponent (13256118217109) in your example calculation. This means that if 4294967295 is your MAX Exponent, you will get that error. And I do get that error when trying your example computation with that build.

Now, there is still a way around it, but it requires some math knowledge.

You must know one fact: (x^y)*(x^z) == x^(y+z).

In essence, if you multiply two powers of one number, the effect is the same as adding the exponents of those powers and then calculating the power from the sum of those exponents.

Usually, you would want to just add the exponents and then calculate the power because multiplications would be much more expensive. However, in this case, we can do the opposite: split the exponent (13256118217109) into several numbers and then multiply all of those numbers together.

Let's decide on one exponent to use, which I'll call E. I can divide 13256118217109 by E, and the result is the number of times that I need to multiply the power by itself. However, that will not be quite right because E almost certainly does not divide 13256118217109 evenly, so 13256118217109%E won't be 0. In that case, we'll want to multiply the end result by the power where the exponent is equal to 13256118217109%E.

It will all look like this (in bc code):

scale = 0
s = ((169287^137)^920)
n = s^E
r = n
t = 13256118217109 / E
m = 13256118217109 % E

for (i = 1; i < t; ++i)
{
    r *= n
}

r *= s^m

But even then, we can make it faster. Since we are just multiplying the value by itself many times, we can use power again. This time, the exponents (3086 and 6172) are back within limits, so we're safe.

If we do that, we get:

scale = 0
s = ((169287^137)^920)
t = 13256118217109 / E
m = 13256118217109 % E
n = s^E
r = n^t
r *= s^m

And that should give us our answer.

We should set those exponents as best as we can to make it as efficient as possible.

I know that my bc uses the binary representation of the exponent to calculate power. Every bit that's a 1 has an extra multiply.

So if I set one of the powers to (2^32)-1, I know every single bit will be 1, meaning that there will be extra multiplications. However, it will also mean I need to use an exponent of 3086 (13256118217109/((2^32)-1)), whereas if I set one of the powers to be 2^31, there will only be a single 1 bit, but I will need to use an exponent of 6172.

An exponent of 6172 has the same amount of 1 bits as 3086 (it's just 3086 times 2), so it will require only one more multiply. However, 2^31 will require 30 less multiplies than (2^32)-1, so we'll go with that.

That gives us this code:

scale = 0
e = 2^31
s = ((169287^137)^920)
t = 13256118217109 / e
m = 13256118217109 % e
n = s^e
r = n^t
r *= s^m

I haven't tested the above code because it will take forever. If there are problems, please let me know.

I would like to test it for benchmark purposes against Julia REPL (for example)

My bc will be slower than Julia. I am confident of that.

This is because my bc does not use the full range of hardware-native integers in order to use decimal-based math. I bet Julia does use the full range and uses binary-based math.

Also, since it's taken seven hours on my bc on a 64-bit machine, I am wildly confident that Julia will be better.

You may want a benchmark with a smaller exponent.

@STSMHQ
Copy link
Author

STSMHQ commented Mar 17, 2023

First of all, I just want to let you know that you're really, really amazing. The passion that you've for your project(s) and for the people that interact with them is something that melts my heart. I hope one day I can be a programmer as good as you and also, more important, have the same amount of passion that you have.

English is not my primary language, so this may not make any sense.

Anyway, one way to get around the limit would be to get a computer with a 64-bit unsigned long type. Machines that may not have that could include Windows (even if 64-bit) or 32-bit machines with any OS.

I'm using a x64 Windows Machine, that's the reason, so. I'll start compiling your bc on Linux using WSL.

You must know one fact: (x^y)*(x^z) == x^(y+z).

In essence, if you multiply two powers of one number, the effect is the same as adding the exponents of those powers and then calculating the power from the sum of those exponents.

I knew that, but somehow I didn't think about it. Thanks for the recap.

My bc will be slower than Julia. I am confident of that.

This is because my bc does not use the full range of hardware-native integers in order to use decimal-based math. I bet Julia does use the full range and uses binary-based math.

Well, that's kinda "chinese" to me. I'm not as good as you to understand that low-level programming stuff

You may want a benchmark with a smaller exponent.

I'll.

On another topic, I'm using your tool to keep track of my expenses. Before closing this issue, I would like to ask you if there is any way that I can use "import" statements in your bc ecosystem.

I know that I can work with more than one file at the same time using the -f/-c flags, but I'm wondering if is possible to access multiple files by only importing one - and use data from other files inside it.

My goal (if that can help you answering it) is to have a single file with functions/operations and then a bunch of other files only with data (expenses and gains) - like a local mini database - that I can call in the "main" bc file and work with.

@gavinhoward
Copy link
Owner

gavinhoward commented Mar 18, 2023

Thank you for your compliment!

Just so you know, I have not always had this passion. But I firmly believe that software is meant to serve people. That's why I focus so much on helping others.

It makes me really happy that you noticed!

I'm using a x64 Windows Machine, that's the reason, so. I'll start compiling your bc on Linux using WSL.

Yeah, sorry about that. Windows is dumb in that they decided that the unsigned long type should remain 32 bits. It was a stupid decision.

Well, that's kinda "chinese" to me. I'm not as good as you to understand that low-level programming stuff

I apologize. I'll try again.

Under the hood, bc uses 32-bit unsigned integers (on 64-bit) as "digits". It turns out that the highest power of 10 that 32 bits can contain is 10^9, or 1,000,000,000. This is one more than the maximum value that bc allows in each 32-bit integer. If it goes higher than this, bc will detect that it went higher and add the extra to the next "digit". This keeps each digit below 1,000,000,000.

However, that is not optimal. I did that because bc needs to have decimal-based math. (In other words, the math needs to be in base 10.) If you can have your math in binary, base 2, it's much cheaper because then you can make your limit 2^32, which is much higher than 10^9 (4,294,967,296 vs. 1,000,000,000).

Julia uses base 2. This means that each "digit" can store more, which means the numbers are smaller (in number of digits). And smaller numbers mean faster execution.

I hope that made sense.

On another topic, I'm using your tool to keep track of my expenses.

I admit that's terrifying to me. I mean, technically, it should be safe, as long as you don't do any division or modulus in your scripts. But I don't want to screw up your finances with bc.

I hope you remember there's no warranty. If there is a bug, I'll try to help, but I can't guarantee that your financial records will be correct in the presence of bugs.

I will guarantee, though, that I know of no bugs. I have fixed all of the ones I am aware of.

Before closing this issue, I would like to ask you if there is any way that I can use "import" statements in your bc ecosystem.

Sort of.

I know that I can work with more than one file at the same time using the -f/-c flags, but I'm wondering if is possible to access multiple files by only importing one - and use data from other files inside it.

Well, that's the only way.

Actually, it's even simpler than that. bc runs scripts in the order given on the command-line. For example, you could run this:

$ bc funcs.bc database.bc main.bc

And bc will run funcs.bc, then database.bc, then main.bc.

This means that anything in funcs.bc can affect the execution of database.bc, and anything in funcs.bc and database.bc can affect the execution of main.bc.

This means that you can chain together scripts in a way that acts like import. Sort of. If you would only import at the top of a file.

Basically, this would mean that if bc had an import statement, the above would be equivalent to putting:

import funcs.bc
import database.bc

at the top of main.bc.

I do this myself; I have a math research project, and when I get new ideas, I start a new script. But I have all of my common functions in a lib.bc script, so I run it like this:

$ bc lib.bc <current_script>

This even works for global variables; if you set a variable food_expenses to say 101.00 at the top level of a script, that variable will equal 101.00 in later scripts.

So if I were you, I would have a funcs.bc script that defines all of the functions. Then I would have a database.bc script that has the current values of all of your expenses. It should look something like this:

food_expenses = 101.00
entertainment_expenses = 45.50

and so on.

Then, in the last script, I would have the code to actually update the database, but instead of directly updating, I would have it print the new "version" of database.bc to stdout.

Then you would run it like this:

$ bc funcs.bc database.bc main.bc > temp.bc

Then check that temp.bc is correct. If it is, you can remove database.bc and rename temp.bc to database.bc:

$ rm database.bc
$ mv temp.bc database.bc

To make the example more concrete, let's assume that you have a funcs.bc that looks like this:

define void add_to_food_expenses(expense) {
    # If you don't put `food_expenses` in an `auto` statement,
    # then the global variable will be affected.
    food_expenses += expense
    # The global `total_expenses` is also affected with an `auto`.
    total_expenses += expense
}

define void print_new_database() {
    print "food_expenses = ", food_expenses, "\n"
    print "total_expenses = ", total_expenses, "\n"
}

Then perhaps you have a database.bc that looks like this:

food_expenses = 101.00
total_expenses = 404.40

Then you have a main.bc that looks like this:

# read() reads from stdin.
expense = read()

add_to_food_expenses(expense)

print_new_database()

Then running the following:

$ echo "35.75" | bc funcs.bc database.bc main.bc

will print:

food_expenses = 136.75
total_expenses = 440.15

which if you redirect to temp.bc:

$ echo "35.75" | bc funcs.bc database.bc main.bc > temp.bc

will make temp.bc a valid replacement for database.bc.

(While it may be safe to redirect straight to database.bc, I would not suggest doing so. If funcs.bc has any print statements that run, database.bc will be overwritten before it is ever read.)

My goal (if that can help you answering it) is to have a single file with functions/operations and then a bunch of other files only with data (expenses and gains) - like a local mini database - that I can call in the "main" bc file and work with.

I think I answered according to your goal. I hope that helps, but please come back with any questions you have.

@STSMHQ
Copy link
Author

STSMHQ commented Mar 21, 2023

Hi. Sorry for the delay replying.

It makes me really happy that you noticed!

Of course. I'm lacking motivation for programming - even in my university project (I'm currently in my last year of my master degree) -, specially now with all the improvements made in the IA field, but reading your blog posts and messages like this one, somehow, gives me motivation to continue.

I would like to suggest, if possible, a way to interact with you in your blog (like a comments section, for example).

I hope that made sense.

It did. I studied it in my first year at university, but I didn't know that specifically, so thank you.

I hope you remember there's no warranty. If there is a bug, I'll try to help, but I can't guarantee that your financial records will be correct in the presence of bugs.

I will guarantee, though, that I know of no bugs. I have fixed all of the ones I am aware of.

And that's all that matters for me. There is nothing that compares to the support that you offer.

I think I answered according to your goal. I hope that helps, but please come back with any questions you have.

You did. Thanks again for the time spent on this. I use a different set of files, but the information that you gave me is going to help me improve my system. So, thanks (one more time xD).

@STSMHQ STSMHQ closed this as completed Mar 21, 2023
@gavinhoward
Copy link
Owner

Sorry for the delay replying.

No worries!

Of course. I'm lacking motivation for programming - even in my university project (I'm currently in my last year of my master degree)

Oh, cool, what is your project? (I dropped out of a Master's; I'm kind of jealous.)

I would like to suggest, if possible, a way to interact with you in your blog (like a comments section, for example).

I have thought about this many times and have always rejected the idea due to the amount of spam-busting and moderation required, at least as told by other bloggers. Oh, and I don't want people to see a buggy site if they have JavaScript turned off as comments nearly always require JavaScript.

However, your suggestion has made me look at it again, and I had an idea.

One of the best pieces of software, at least for forums, is the software underpinning https://lobste.rs/. While I have been banned from that site, I appreciate the software for its moderation capabilities. This includes invite-only registration, which would help a lot to only allow people who want to interact with me and others in a good way.

I've also always wanted to run a forum site where Free Speech was honored as much as possible because I think Free Speech is necessary to find the truth.

So in response to your suggestion, if the lobste.rs software will work for my purposes, I'll spin up an instance at https://forum.gavinhoward.com/ (or something similar), and I'll send you an invite, along with anyone else whose opinions I trust. And every time I post something new, I'll put it on the instance and link to it from the post. That should allow me to have a "comments section" while not interfering with my current site.

I think I'll also allow others to post stuff too. Maybe I'll make its theme to be anything that is intellectually stimulating. I don't know; I'm just spitballing here.

I studied it in my first year at university, but I didn't know that specifically, so thank you.

You're welcome!

And that's all that matters for me.

Thank goodness!

There is nothing that compares to the support that you offer.

You're welcome! And thank you. I am overjoyed to hear this, actually.

I'm starting a business right now, and my product is support for Open Source software that I have written. I was wondering if that would be a good product, but if my support is good, it might work. So I'm thrilled that you think so!

(Don't worry; I'll always support bc for free; it's my gift to the world. The software that I'll be supporting for pay is at https://git.yzena.com/Yzena/Yc.)

Full disclosure: this is also why I took your suggestion for a comments section more seriously; it would be a good way for me to network with potential clients. I apologize for having less than perfect motives.

Also, to take it one step further, can I point to this interaction we had as an example of the support I will give? I think it would be great material to advertise with. No offense if you don't want that though; I know that's a selfish motive.

You did. Thanks again for the time spent on this. I use a different set of files, but the information that you gave me is going to help me improve my system. So, thanks (one more time xD).

You're welcome! Glad I could help!

@STSMHQ
Copy link
Author

STSMHQ commented Mar 22, 2023

Oh, cool, what is your project? (I dropped out of a Master's; I'm kind of jealous.)

You don't need to be. You seem to have the required knowledge to be a teacher on my university. I'm from Portugal, btw.

My project is about telemetry. I'm building a telemetry solution using OpenTelemetry and multiple analysis backends for a health company. The problem is that such project is inside DevOps field and that's not my expertice or even the field that I want to work on. I want to be a programmer, a system programmer to be more precise.

This includes invite-only registration, which would help a lot to only allow people who want to interact with me and others in a good way.

That seems a good way to implement it. I didn't know about that specific website, but after a brief usage of it, I can say that I liked the way that it works.

So in response to your suggestion, if the lobste.rs software will work for my purposes, I'll spin up an instance at https://forum.gavinhoward.com/ (or something similar), and I'll send you an invite, along with anyone else whose opinions I trust. And every time I post something new, I'll put it on the instance and link to it from the post. That should allow me to have a "comments section" while not interfering with my current site.

That makes a lot of sense. I appreciate the consideration for my opinion xD

When that instance is online, let me know, please.

I'm starting a business right now, and my product is support for Open Source software that I have written. I was wondering if that would be a good product, but if my support is good, it might work. So I'm thrilled that you think so!

I see.. something like the commercial support that Daniel Stenberg offers for cURL?

(Don't worry; I'll always support bc for free; it's my gift to the world. The software that I'll be supporting for pay is at https://git.yzena.com/Yzena/Yc.)

That's nice to read. I'm planning to use bc for a very long time :d

Full disclosure: this is also why I took your suggestion for a comments section more seriously; it would be a good way for me to network with potential clients. I apologize for having less than perfect motives.

It wasn't necessary to make that disclosure. You're too kind just for considering having to do it.

Also, to take it one step further, can I point to this interaction we had as an example of the support I will give? I think it would be great material to advertise with. No offense if you don't want that though; I know that's a selfish motive.

Of course you can. I'm flattered to be part of that.

You're welcome! Glad I could help!

You always do. Thanks (one more time)!

@gavinhoward
Copy link
Owner

The problem is that such project is inside DevOps field and that's not my expertice or even the field that I want to work on. I want to be a programmer, a system programmer to be more precise.

Ouch. I feel for you, and I feel lucky.

I know this may not help, but I have personally found that my system programming skills (dev) got better after I learned system admininstration (ops).

System administration is important to me for another reason: without it, I can't host my code or the forum that I mentioned.

Once you get your Master's, maybe you can try to do what I have done: build excellent system software and support it. That may give you the room you need to make it a career. If it is going to work for me, it will surely work for you too!

When that instance is online, let me know, please.

It's up at https://forum.gavinhoward.com/, but email is not working, so I need to wait. Please send me an email using the email listed at https://gavinhoward.com/contact/, and I'll send you an invite once email is working.

I see.. something like the commercial support that Daniel Stenberg offers for cURL?

Yes, basically.

Of course you can. I'm flattered to be part of that.

Thank you!

@STSMHQ
Copy link
Author

STSMHQ commented Mar 29, 2023

Sorry for the delay replying (again).

I know this may not help, but I have personally found that my system programming skills (dev) got better after I learned system admininstration (ops).

I hope you're right.

Once you get your Master's, maybe you can try to do what I have done: build excellent system software and support it. That may give you the room you need to make it a career. If it is going to work for me, it will surely work for you too!

I'm trying to learn Rust on my free time, but it's been hard - I know your opinion about Rust xD -, let's see.

It's up at https://forum.gavinhoward.com/, but email is not working, so I need to wait. Please send me an email using the email listed at https://gavinhoward.com/contact/, and I'll send you an invite once email is working.

The forum returns a 500 error code to me and the contact page shows me that your e-mail is <my_first_name>@<this_website>. I'll send an email to <redacted> and hope it works.

Yes, basically.

I see. Well, good luck for your new project. I'm pretty sure that people will love your support as much as I do.

@gavinhoward
Copy link
Owner

I'm trying to learn Rust on my free time, but it's been hard - I know your opinion about Rust xD -, let's see.

Remember that my Rust opinion is personal; Rust is pretty good in general.

The forum returns a 500 error code to me and the contact page shows me that your e-mail is <my_first_name>@<this_website>.

I'm having trouble with the forum. In fact, it turns out that sending email just doesn't work. My email provider is blaming the lobste.rs software, but the software has the right settings, so...

I'm probably just going to have to write my own forum software. Sorry for the delay. It may be a year or so.

I'll send an email to <redacted> and hope it works.

That is the correct address. I've redacted it because I don't want it machine-readable (to avoid spam), but it is correct.

I see. Well, good luck for your new project. I'm pretty sure that people will love your support as much as I do.

Thank you!

@STSMHQ
Copy link
Author

STSMHQ commented Mar 30, 2023

Remember that my Rust opinion is personal; Rust is pretty good in general.

I hope it is here to stay for the long run.

I'm probably just going to have to write my own forum software. Sorry for the delay. It may be a year or so.

No problem. I'll put a reminder on my calendar, so. See you in a year (or so). Have a nice week!

@TediusTimmy
Copy link

Hey, I realize that you asked this five months ago, but I just saw this and I wanted to chime in, because I love REALLLY BIG NUMBERS.

Now, Gavin noted that (x^y)*(x^z) == x^(y+z). It is also true that (x^y)^z == x^(y*z). So, we can simplify your question to 169287^1670801140084418360. Hoo, boy is that a big number! How big is it? 2.4917161343640611*10^8735990286585912792 The number that describes how many digits are in the number has 19 digits in it.

How did I do this (and what is Julia probably doing under the covers)? The same way a slide rule works: logarithms. It is definitionally true that a^u=e^(ln(a^u)). So a^u=e(u*ln(a)). But this works in any base, so I'll choose base 10 (I hope you remember your laws of logarithms).

To follow along:

scale=40
137*920*13256118217109*l(169287)/l(10)

That should give you that exponent. Remember: the number we want is ten raised to this power. To get the leading digits, we extract the mantissa (the fractional portion of a logarithm), and then raise 10 to this power. This works because of what Gavin noted: we can separate 3.2 into 3+.2, and we are raising ten to this power, so it becomes (10^3)*(10^.2):

e(.3964985643509537948664437764527360339575*l(10))

I want to point out two things: firstly, bc is going to compute all of the digits of this number; secondly, it is a big number. If we pack 19 decimal digits into every 8 bytes, it will still take over 3,000 petabytes of memory to store this number. To give you a sense of scale: at the end of 2021, Internet Archive had 212 petabytes of data archived.

@STSMHQ
Copy link
Author

STSMHQ commented Aug 22, 2023

Hi, @TediusTimmy. Thank you for your explanation. It helped a lot. I didn't know some of that mathematic equivalences. Have a nice week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants