# Numerical Methods 02: A First Foray into the Forest

## Gabriel M Steward

### January 2023

<a id='toc'></a>

# Table of Contents
$$\label{toc}$$

[Problem 1](#P1) (The only one)

![image.png](attachment:image.png)

<a id='P1'></a>

# Problem 1 \[Back to [top](#toc)\]
$$\label{P1}$$

![image-2.png](attachment:image-2.png)

Okay so first of all, for all five of them we can immediately see the sign: a sign bit of 0 is positive and a sign bit of 1 is negative, since $(-1)^0 = 1$ and $(-1)^1 = -1$. Thus, i-iii are positive, and both iv and v are negative. 

The exponent is next, taking up two bits. For i through iii, this is 11. So X = $2^0 + 2^1$ = 3. Now, this is just the number stored for the exponent; the actual exponent is X-2 as we see from the definition, so we have an exponent of 1 for the first three. For iv we have 00, so X=0 rather obviously, and X-2 = -2. For v we have 10, or X=2, making the exponent itself 0. 

Now with the simple stuff out of the way, we handle each case separately. 0100 is the mantissa for i. The part that uses the mantissa, the parenthetical in the definition of N, always has a leading 1 in front of it and then various negative powers of two. Essentially, the number is 1.0100 here. So in decimal terms, it would be $1+2^{-2}$ or 1+1/4 or 1.25. This is multiplied by the power of two we found earlier, $2^1$, so we end up with 2.50.

$$ \boxed{N_i = 2.5e0} $$

Next we have a mantissa of 0101, which corresponds to the actual binary number 1.0101, which we can represent as $1 + 2^{-2} + 2^{-4}$ or $1 + 1/4 + 1/16 = 1 + 0.25 + 0.0625 = 1.3125$. Nicely, 1/16 is exactly representable in base 10. 

Our exponent is the same as before, so we just multiply by 2, getting 2.625. Which is barely larger than the previous number, which is exactly what we should expect. 

$$ \boxed{N_{ii} = 2.625e0} $$

Mantissa of 1100, corresponding to 1.1100, and then to $1 + 1/2 + 1/4 = 1.75$. The exponent is still the same, so we multiply by 2 and get 3.50. 

$$ \boxed{N_{iii} = 3.5e0} $$

Now things change a bit. First of all, the number is negative for iv. Secondly, our exponent is now -2, so at the end we have a $2^{-2}$ or 1/4 to multiply by. 

The mantissa is 0010, corresponding to 1.0010, which we convert into $1 + 2^{-3} = 1 + 1/8 = 1.125$. Multiplied by 1/4, this gives us 0.28125. Naturally we need this in proper scientific notation for the answer, so...

$$ \boxed{N_{iv} = -2.8125e-1} $$

Negative result for v. The exponent is just 0, so we'll be multiplying by one, aka doing nothing.

The mantissa is 0011, corresponding to 1.0011, which becomes $1 + 1/8 + 1/16 = 1.1875$. 

$$ \boxed{N_v = -1.1875e0} $$
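The five conversions above can be double-checked with a small decoder. This is only a sketch: the function name `decode7` is made up for illustration, and the full bit strings are reconstructed from the sign, exponent, and mantissa values discussed above (1 sign bit, 2 exponent bits, 4 mantissa bits, exponent X-2).

```python
def decode7(bits: str) -> float:
    """Decode a 7-bit pattern: 1 sign bit, 2 exponent bits, 4 mantissa bits."""
    assert len(bits) == 7 and set(bits) <= {"0", "1"}
    sign = -1.0 if bits[0] == "1" else 1.0
    stored_exponent = int(bits[1:3], 2)   # X in the text
    fraction = int(bits[3:], 2) / 16      # four fractional binary digits
    return sign * (1 + fraction) * 2.0 ** (stored_exponent - 2)

# Bit strings assembled from the pieces worked out above (an assumption).
patterns = {
    "i": "0110100",
    "ii": "0110101",
    "iii": "0111100",
    "iv": "1000010",
    "v": "1100011",
}

for label, bits in patterns.items():
    print(f"{label:>3}: {bits} -> {decode7(bits):.4e}")
```

Every value involved is a small sum of powers of two, so the floating-point arithmetic here is exact.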

![image-2.png](attachment:image-2.png)

We have somewhat of a clue in the fifth result of part a), for it is close to 1, just with some added stuff to it. So we take the same exponent 10, that is $2^0$ and we make the sign bit positive. Since there is an implied leading one at the front, all we need to do... is set everything else to zero.

$$ \boxed{1_{base10} = 0100000} $$

We can also write the smallest positive number similarly, just lower the exponent: 0000000, which is $2^{-2} = 0.25$. We note that since this is all zeroes it may be overridden to *be* zero, and with this excluded the smallest positive number would be 0000001. 

Largest is much easier. Maximize everything: 0111111, which is $1.9375 \times 2^1 = 3.875$. If this is for some reason set to +infinity, then 0111110 will do. 

Change the sign bit to get the negative versions. 
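As a sanity check on the extremes, one can simply enumerate all 128 patterns. The sketch below assumes the same 7-bit layout as above and treats every pattern as an ordinary number (no special zero or infinity); `decode7` is an illustrative helper, not part of the assignment.

```python
def decode7(bits: str) -> float:
    """Decode 1 sign bit, 2 exponent bits, 4 mantissa bits (exponent X-2)."""
    sign = -1.0 if bits[0] == "1" else 1.0
    stored_exponent = int(bits[1:3], 2)
    fraction = int(bits[3:], 2) / 16
    return sign * (1 + fraction) * 2.0 ** (stored_exponent - 2)

# Decode every possible 7-bit pattern and keep the positive ones, sorted by value.
all_bits = [format(n, "07b") for n in range(128)]
positives = sorted((decode7(b), b) for b in all_bits if decode7(b) > 0)

smallest, second_smallest = positives[0], positives[1]
largest, second_largest = positives[-1], positives[-2]

print("smallest positive:", smallest)          # (0.25, '0000000')
print("next smallest:    ", second_smallest)   # (0.265625, '0000001')
print("largest:          ", largest)           # (3.875, '0111111')
print("next largest:     ", second_largest)    # (3.75, '0111110')
```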

![image-2.png](attachment:image-2.png)

Not sure if there's really work to show here, it's mostly just adjusting indices. $b_9$ is the sign bit, $b_8$ through $b_4$ are the exponent bits, and $b_3$ through $b_0$ are the mantissa bits. 

$$ N = (-1)^{b_9} \left( 1 + \sum_{i=1}^4 b_{4-i} 2^{-i} \right) 2^{X-16} $$

$$ X = \sum_{i=0}^4 b_{4+i}2^i $$
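To see that the indices line up, here is a minimal sketch that evaluates the two sums literally. The name `decode10` and the test patterns are illustrative choices, and the string is assumed to be written with $b_9$ as the leftmost character.

```python
def decode10(bits: str) -> float:
    """Evaluate N for a 10-bit pattern using the index sums directly."""
    assert len(bits) == 10 and set(bits) <= {"0", "1"}
    b = [int(c) for c in reversed(bits)]                  # b[k] is b_k
    X = sum(b[4 + i] * 2**i for i in range(5))            # stored exponent
    fraction = sum(b[4 - i] * 2.0**-i for i in range(1, 5))
    return (-1) ** b[9] * (1 + fraction) * 2.0 ** (X - 16)

print(decode10("0100000000"))   # exponent bits 10000, mantissa 0000 -> 1.0
print(decode10("1100001000"))   # same exponent, mantissa 1000, negative -> -1.5
```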

![image.png](attachment:image.png)

So, there are probably several ways to do this, but we shall consider fractions. We have a four-bit mantissa, which means (with the leading 1) we get five binary digits. 

Let's start by converting 1/10 to binary piece by piece. In actual binary, we have nothing in the 1s place, or the 1/2 place, or the 1/4, or the 1/8, since each of those is larger than 1/10. The first place where we get anything is 1/16, so 0.0001. Which means to fill every bit we have, we need to go to 0.0001XXXX, that is, the 1/32nd place, the 1/64th place, the 1/128th place, and the 1/256th place. 

1/32 + 1/16 = 3/32, which is still less than 1/10, so it has a value of 1. 

1/64 + 3/32 = 7/64, greater than 1/10, this place has a value of 0.

1/128 + 6/64 = 13/128, greater than 1/10, thus a value of 0.

1/256 + 12/128 = 25/256, less than 1/10, so this place has a value of 1. Our number is therefore 0.00011001.

Since we may have an issue with rounding potentially, go one step further. 1/512 + 25/256 = 51/512, another 1 place. This heavily implies that 1/10 is 0.000110011001100110011... repeating forever, but as far as we are concerned we consider it 0.000110011. In scientific notation, (in binary) this is 1.10011e-4.
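The place-by-place search above is a greedy loop: at each place, add the next power of 1/2 if the running total stays at or below the target. Here is a minimal sketch using exact fractions; the function name `binary_fraction_digits` is made up for illustration.

```python
from fractions import Fraction

def binary_fraction_digits(value: Fraction, places: int) -> str:
    """Return the first `places` binary digits after the point of 0 <= value < 1."""
    digits = []
    total = Fraction(0)
    for k in range(1, places + 1):
        candidate = total + Fraction(1, 2**k)
        if candidate <= value:        # this place fits, so its digit is 1
            digits.append("1")
            total = candidate
        else:                         # overshoots the target, so its digit is 0
            digits.append("0")
    return "".join(digits)

digits = binary_fraction_digits(Fraction(1, 10), 9)
print("0." + digits)   # 0.000110011
```

Asking for more places continues the 0011 pattern, which matches the repeating expansion guessed above.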

First, this is positive, so the sign bit has to be 0.

Secondly, we want the power of two to be $2^{-4}$, so X has to be equal to 12, for 12-16 = -4. 12, as an integer, is not hard to represent in binary: 1100. Since we have a fifth digit our exponent will actually be 01100. 

Then we get to the mantissa. The question is, do we truncate or round up? Arguments can be made for both 1.1001 and 1.1010, given that the next bit, in the 512ths place, is a 1. However, we can still know something: even with that bit included, our total was 51/512, which is *less than* 1/10. That means there have to be more nonzero digits afterward, so the part we would discard is *more* than half a step in the last mantissa place, and we should round up. 

Therefore, the binary mantissa should be 1.1010. (If we just truncated it would be 1.1001). The leading 1 is assumed, so our mantissa is 1010. 

Putting it all together we get...

$$ 1/10 \approx 0011001010 $$

With 0011001001 being further below 1/10 than the given answer is above it. 
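The comparison between the two candidates can be made exact with fractions. This sketch only checks the arithmetic; the variable names are illustrative.

```python
from fractions import Fraction

target = Fraction(1, 10)

# 1.1010 (binary) * 2^-4 and 1.1001 (binary) * 2^-4, written as exact fractions.
rounded_up = Fraction(0b11010, 16) / 16   # 13/128 = 0.1015625
truncated = Fraction(0b11001, 16) / 16    # 25/256 = 0.09765625

error_up = rounded_up - target            # 1/640
error_down = target - truncated           # 3/1280

print(float(rounded_up), float(error_up))
print(float(truncated), float(error_down))
print("rounding up is closer:", error_up < error_down)
```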

![image.png](attachment:image.png)

Greater than one, not zero? Sign bit is positive. 1 would be given by X-16 = 0, so X=16 is given by an exponent of 10000. Mantissa for just 1 would be 0000, so the next largest number is 0001. All together we get...

$$ 1 < 0100000001 $$

The smallest number larger than one. 
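A brute-force check is cheap here, since there are only 1024 patterns. The sketch below assumes the 10-bit layout from earlier (1 sign bit, 5 exponent bits, 4 mantissa bits, exponent X-16) with no special values; `decode10` is an illustrative helper.

```python
def decode10(bits: str) -> float:
    """Decode 1 sign bit, 5 exponent bits, 4 mantissa bits (exponent X-16)."""
    sign = -1.0 if bits[0] == "1" else 1.0
    stored_exponent = int(bits[1:6], 2)
    fraction = int(bits[6:], 2) / 16
    return sign * (1 + fraction) * 2.0 ** (stored_exponent - 16)

# Decode every pattern and pick the smallest value strictly greater than 1.
decoded = [(decode10(format(n, "010b")), format(n, "010b")) for n in range(1024)]
next_value, next_bits = min((v, b) for v, b in decoded if v > 1)

print(next_bits, next_value)   # 0100000001 1.0625
```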

![image.png](attachment:image.png)

So, from part iii) we can deduce what the number *just larger* than machine $\epsilon$ is, which in binary would be 0.0001, or 1e-4 (again in binary). 

Just *barely* less than 0.0001 would be 0.000011111, or 1.1111e-5. The question is, can we actually represent this number? Well, the exponent needs X-16=-5, so X=11 is what we want, or 01011 for the exponent. Sign must be positive, of course. The mantissa would be all 1s, 1111. 

Thus...

$$ \epsilon = 0010111111 $$

This is basically saying "well, since the smallest number we can detect is 1, let's lower it to 0.99999 to make it so we can't detect it." All we did was perform a sort of "subtraction" in binary out to how many digits we could store in the mantissa. If we had a larger mantissa, we'd just add more 1s until we reached the end. 

One could also think of it as approaching a limit--if we keep adding 1s to the end we'll get closer to but never actually make the digits roll over! 
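To put a number on this, the sketch below evaluates the bit pattern found above and adds it to 1. It assumes results are *truncated* (chopped) to four mantissa bits, which is the reading the argument above relies on; a round-to-nearest machine would behave differently. The helper `truncate_to_format` is hypothetical and ignores exponent limits.

```python
from fractions import Fraction

def truncate_to_format(x: Fraction) -> Fraction:
    """Chop a positive value to 1 leading bit plus 4 mantissa bits (assumed truncation)."""
    exponent = 0
    while Fraction(2) ** (exponent + 1) <= x:
        exponent += 1
    while Fraction(2) ** exponent > x:
        exponent -= 1
    step = Fraction(2) ** (exponent - 4)   # spacing of representable numbers near x
    return (x // step) * step

epsilon = Fraction(0b11111, 16) / 32       # 1.1111 (binary) * 2^-5 = 31/512
just_above = Fraction(1, 16)               # 0.0001 (binary)

print(float(epsilon))                            # 0.060546875
print(truncate_to_format(1 + epsilon) == 1)      # True: the addition is lost
print(truncate_to_format(1 + just_above) == 1)   # False: 1.0001 (binary) is stored
```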