<h1>Overflow Errors</h1>
<p><img src="images/1line.png" width="100%" /></p>


<h3>Buffer Overflow</h3>
<ul>

<li>A buffer overflow occurs when data is written beyond the boundaries of a fixed-length buffer, overwriting adjacent memory locations, which may include other buffers, variables, and program flow data.</li>
<li>Sometimes called the &ldquo;nuclear bomb&rdquo; of the software industry, the buffer overflow is one of the most persistent security vulnerabilities and one of the most frequently exploited.</li>
</ul>
<h4>Risk: How Can It Happen?</h4>
<ul>
<li>Writing outside the bounds of a block of allocated memory can corrupt data, crash the program, or cause the execution of malicious code.</li>
<li>Python, like Java, guards against buffer overflow by checking the bounds of a buffer (such as a list or array) and refusing any access beyond those bounds. For example, an attempt to access the 11th element of a 10-element buffer is rejected with an error.
<ul>
<li>In a memory-unsafe programming language (e.g. C/C++), the program will instead look at wherever in memory the 11th element would be (if it existed) and try to access it.</li>
</ul>
</li>
<li>No language is perfect, though, so it is essential for all programmers to understand the concepts described below.</li>
</ul>
<h4>Real-world Example:</h4>
<ul>
<li>Buffer overflow vulnerabilities were exploited by the first major attack on the Internet. Known as the Morris worm, this attack disabled an estimated 10% of the more than 60,000 machines then connected to the Internet and shut down much of the network for several days in 1988.<br /><span style="font-size: xx-small;">Carolyn Duffy Marsan. &ldquo;Morris Worm Turns 20: Look What It&rsquo;s Done.&rdquo; Network World, October 30, 2008.</span></li>
</ul>
<h4>Example in Code:</h4>
<ul>
<li>Python offers various ways to create and manipulate lists and arrays. If you use a list of a predetermined size and write past its end, Python raises an IndexError instead of allowing a buffer overflow.</li>
</ul>

In [2]:
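# A list with exactly 10 slots (valid indices 0 through 9)
buffer=[None]*10
# The loop runs i = 0..13, so the write at index 10 is out of range
for i in range(0,14):
    buffer[i]=i
    print(buffer[i])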

0
1
2
3
4
5
6
7
8
9


IndexError: list assignment index out of range

<ul>
<li>In the code above, buffer has 10 elements but the loop attempts to write 14 (indices 0 through 13), so the write at index 10 raises an IndexError.</li>
</ul>
<p><em><strong>Code Responsibly &ndash; How Can I Avoid Buffer Overflow?</strong></em></p>
<ul>
<li><em>Make sure you have enough space:</em>
<ul>
<li>Before copying data to a fixed size block, make sure it is large enough to hold the data that you are going to copy.</li>
<li>If it is not large enough, do not copy more data than your available space can hold.</li>
<li>If your program is not able to continue properly after filling the available space, you may have to find some way to recover from the error.</li>
</ul>
</li>
<li><em>Validate indices:</em>
<ul>
<li>If you have an integer variable, verify that it is within the proper bounds before you use it as an index to an array.</li>
<li>This validation is particularly important for any values that might have been provided as user input or other untrusted input, such as information that might be read from a file or from a network connection.</li>
</ul>
</li>
<li><em>Use alternative data structures that reduce the risk of overflows: </em>
<ul>
<li>When possible, create lists in Python without defining an initial size.</li>
<li>Use the .append method to add elements. The list grows as needed, which reduces your risk of buffer overflow vulnerabilities.</li>
</ul>
</li>
</ul>

In [3]:
buffer=[]
for i in range(0,14):
    buffer.append(i)
    print(buffer[i])

0
1
2
3
4
5
6
7
8
9
10
11
12
13
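
<p>The &ldquo;make sure you have enough space&rdquo; and &ldquo;validate indices&rdquo; advice above can be sketched as two small helpers. The names <code>safe_write</code> and <code>copy_bounded</code> are hypothetical, chosen only for illustration. Python accepts negative indices (they count from the end of the list) without raising an error, so this sketch assumes negative indices should be rejected too.</p>

```python
def safe_write(buffer, index, value):
    """Store value at buffer[index] only if index is a valid, non-negative position.

    Returns True if the write happened, False otherwise.
    """
    if not isinstance(index, int) or not (0 <= index < len(buffer)):
        return False
    buffer[index] = value
    return True


def copy_bounded(dest, src):
    """Copy as many items from src as fit into dest; return how many were copied."""
    count = min(len(dest), len(src))
    dest[:count] = src[:count]  # same-length slice assignment keeps len(dest) fixed
    return count


buffer = [None] * 10
for i in range(0, 14):
    if not safe_write(buffer, i, i):
        # Recover by stopping cleanly instead of crashing with an IndexError
        print("index", i, "rejected: buffer holds only", len(buffer), "elements")
        break
```

<p>Returning a status value is only one possible design. Raising a custom exception, or letting Python's own IndexError propagate and catching it, may suit a given program better.</p>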


<h3>Arithmetic Overflow</h3>
<ul>
<li>Integer values that are too large or too small may fall outside the allowable bounds for their data type, leading to unpredictable problems that can both reduce the robustness of your code and lead to potential security problems.
<ul>
<li>Declaring a variable as type <strong>int</strong> in most programming languages allocates a fixed amount of space in memory.</li>
<li>Most languages include several integer types, including <strong>short, int, long,</strong> etc., to allow for less or more storage.</li>
<li>The amount of space allocated limits the range of values that can be stored. For example, a 32-bit signed <strong>int</strong> variable can hold values from -2<sup>31</sup> through 2<sup>31</sup>-1.</li>
</ul>
</li>
<li>Input or mathematical operations such as addition, subtraction, and multiplication may lead to values that are outside of this range.
<ul>
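<li>This results in an integer error or overflow. Depending on the language and the integer type, the value may silently wrap around or the behavior may be undefined (as with signed overflow in C/C++). Either way, the resulting value will likely not be what the programmer intended. Integer overflow is a common cause of software errors and vulnerabilities.</li>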
</ul>
</li>
</ul>
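
<p>To make the 32-bit range concrete, the sketch below simulates in pure Python what a wrap-around looks like for a 32-bit signed integer. The helper name <code>wrap_int32</code> is hypothetical, and the sketch assumes two's-complement wrap-around. Some languages wrap in this way, while others leave signed overflow undefined.</p>

```python
INT32_MIN = -2**31      # -2147483648
INT32_MAX = 2**31 - 1   #  2147483647


def wrap_int32(value):
    """Reduce an arbitrary Python int to the 32-bit signed range by wrapping."""
    return (value + 2**31) % 2**32 - 2**31


# Values inside the range are unchanged
print(wrap_int32(12345))          # 12345

# One past the maximum wraps around to the minimum
print(wrap_int32(INT32_MAX + 1))  # -2147483648

# One below the minimum wraps around to the maximum
print(wrap_int32(INT32_MIN - 1))  # 2147483647
```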
<h4>Risk: How Can It Happen?</h4>
<ul>
<li>Not checking for overflow</li>
<li>Mixing integer types of different ranges</li>
<li>Mixing unsigned and signed integers</li>
</ul>
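
<p>Python's own integers do not have fixed sizes, so the sketch below borrows the standard library's <code>ctypes</code> module, which provides C-compatible fixed-width types and performs no overflow checking when a value is stored. It illustrates, under that assumption, what can happen when a value moves between integer types of different ranges or signedness.</p>

```python
import ctypes

# Mixing ranges: 40000 fits in a 32-bit int but not in a 16-bit signed int
# (maximum 32767), so the stored value wraps around.
narrowed = ctypes.c_int16(40000).value

# Mixing signed and unsigned: -1 stored in an unsigned 32-bit type
# is reinterpreted as the largest unsigned 32-bit value.
reinterpreted = ctypes.c_uint32(-1).value

print(narrowed)       # -25536
print(reinterpreted)  # 4294967295
```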
<h3>Arithmetic Overflow in Python</h3>
<ul>
<li>In <strong>Python</strong>, integers have <em>arbitrary precision</em> and therefore we can represent an arbitrarily large range of integers (limited only by the memory available).
<ul>
<li>This means integers CANNOT overflow in pure Python!</li>
</ul>
</li>
</ul>

In [4]:
# arbitrary precision of integers in python
2 ** 200

1606938044258990275541962092341162602522202993782792835301376

<h3>Arithmetic Overflow in NumPy &amp; pandas</h3>
<ul>
<li>Integers CAN overflow if the operations are done in the pydata stack (<strong>numpy/pandas</strong>), because these libraries use C-style fixed-precision integers.</li>
<li>The example below creates a NumPy array of 64-bit signed integers, so <strong>2<sup>63</sup>&minus;1</strong> is the largest value it can hold.</li>
</ul>

In [5]:
import numpy as np

# NumPy and arithmetic overflow with addition
a = np.array([2**63 - 1, 2**63 - 1])
a

array([9223372036854775807, 9223372036854775807], dtype=int64)

<h4>Creating an Arithmetic Overflow</h4>
<ul>
<li>Adding 1 to the array will <em>silently</em> cause an overflow:</li>
</ul>

In [6]:
a + 1

array([-9223372036854775808, -9223372036854775808], dtype=int64)

<ul>
<li>Summing the array's elements also overflows silently: no error is raised, and the result is simply wrong.</li>
</ul>

In [7]:
# Numpy and arithmetic overflow with sum
a = np.array([2**63 - 1, 2**63 - 1])
a.sum()

-2
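
<p>The result of -2 can be reproduced by hand. The exact sum is 2<sup>64</sup>&minus;2, which does not fit in 64 signed bits, and wrapping it into the 64-bit signed range gives -2. The sketch below checks this with Python's arbitrary-precision integers. The variable names are illustrative only.</p>

```python
import numpy as np

values = [2**63 - 1, 2**63 - 1]

# Exact sum using Python integers, which cannot overflow
exact = sum(values)

# Wrap the exact sum into the 64-bit signed range (two's complement)
wrapped = (exact + 2**63) % 2**64 - 2**63

# The same sum computed by NumPy with fixed 64-bit integers
numpy_sum = int(np.array(values, dtype=np.int64).sum())

print(exact)      # 18446744073709551614
print(wrapped)    # -2
print(numpy_sum)  # -2
```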

<p>Mean DOES work, because it converts the numbers to floats before doing the addition. The result is a floating-point approximation rather than an exact integer:</p>

In [8]:
# Numpy and no arithmetic overflow
a = np.array([2**63 - 1, 2**63 - 1])
a.mean()

9.223372036854776e+18
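
<p>The mean avoids the overflow, but a 64-bit float cannot represent every integer this large exactly. The sketch below compares the floating-point mean against an exact mean computed with Python integers. It assumes the array is small enough that converting it to Python integers is acceptable.</p>

```python
import numpy as np

a = np.array([2**63 - 1, 2**63 - 1], dtype=np.int64)

# Exact mean using Python integers (no overflow, no rounding for this input)
exact_mean = sum(int(x) for x in a) // len(a)

# NumPy's mean, computed in 64-bit floating point
float_mean = a.mean()

print(exact_mean)                     # 9223372036854775807
print(int(float_mean))                # 9223372036854775808
print(int(float_mean) == exact_mean)  # False
```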

<h3>Take-away</h3>
<ul>
<li>You still need to be careful with overflow and precision issues in Python, especially when using the pydata stack (numpy/pandas).</li>
<li><em>Validate input:</em>
<ul>
<li>If you have an integer variable, verify that it is within the proper bounds before you add it to a NumPy array.</li>
<li>When doing mathematical operations in NumPy, verify the size of the numbers BEFORE completing the operation.</li>
</ul>
</li>
<li><em>Use alternative data structures that reduce the risk of arithmetic overflows: </em>
<ul>
<li style="list-style-type: none;">
<ul>
<li>When possible, use Python's built-in integers (for example, stored in a list) instead of NumPy arrays, especially when working with numbers that may become large.</li>
</ul>
</li>
</ul>
</li>
</ul>
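
<p>One way to check the size of the numbers before an operation is to compare against the limits reported by <code>np.iinfo</code>, doing the comparison in Python integers so that the check itself cannot overflow. The helper <code>checked_add</code> below is a hypothetical sketch, not a NumPy function, and it assumes a non-empty array with an integer dtype.</p>

```python
import numpy as np


def checked_add(arr, increment):
    """Add a Python int to every element, refusing if any result would overflow.

    Assumes arr is a non-empty NumPy array with an integer dtype.
    """
    info = np.iinfo(arr.dtype)
    if not (info.min <= increment <= info.max):
        raise OverflowError("increment does not fit in " + str(arr.dtype))
    # int(...) converts to Python integers, so these sums are exact
    highest = int(arr.max()) + increment
    lowest = int(arr.min()) + increment
    if highest > info.max or lowest < info.min:
        raise OverflowError("result does not fit in " + str(arr.dtype))
    return arr + increment


a = np.array([2**63 - 1, 2**63 - 1], dtype=np.int64)
try:
    checked_add(a, 1)
except OverflowError as err:
    print("refused:", err)

# Falling back to a Python list gives the exact sum
print(sum(a.tolist()))  # 18446744073709551614
```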
<hr />
<h3>References</h3>
<p><a href="https://cisserv1.towson.edu/~cssecinj/modules/cs1/buffer-overflow-cs1-python/" target="_blank" rel="noopener">https://cisserv1.towson.edu/~cssecinj/modules/cs1/buffer-overflow-cs1-python/</a>&nbsp;</p>
<p><a href="https://cisserv1.towson.edu/~cyber4all/modules/nanomodules/Integer_Error-CS2_Java.html" target="_blank" rel="noopener">https://cisserv1.towson.edu/~cyber4all/modules/nanomodules/Integer_Error-CS2_Java.html</a>&nbsp;</p>