# Day 3: Arrays and Their Operations 📚

## Table of Contents 📖

- [Objectives](#objectives)
...

## Objectives

- To be continued...

## RAM and Data Representation 💻



A program's efficiency can be assessed by how it consumes different types of computer resources. The main ones are **processor time** and **RAM**, but network bandwidth, disk performance, and other limited resources also matter. ⚙️

In the first sprint, we discussed how an algorithm uses processor time. Now we turn our attention to **memory consumption**, that is, the **spatial complexity** (more commonly called space complexity) of an algorithm. 📊


### 1. How RAM Works 💽


**RAM (or memory)** is a kind of “scratchpad” that programs use for computations. Data can be written, overwritten, and read from it. 💡

Modern computers have just a few gigabytes of **RAM**, and data vanishes as soon as the program terminates. Therefore, this memory is unsuitable for long-term storage, but it operates hundreds of times faster than an **SSD** and thousands of times faster than a standard hard drive. ⚡

**RAM** is divided into cells of 1 byte each. Each cell has an ordinal number, called an **address** (individual bits within a cell do not have addresses). The processor interacts directly with **RAM** using these addresses to access data, similar to how a program accesses variables by name. 🔍

To understand how much memory a program consumes, one must determine the size of individual objects. Although this size can vary between programming languages, the basic principles are common. First, let’s examine how memory is used in **C++**, as it allows for highly efficient, flexible, and predictable memory management. 👍



### 2. Representation of Basic Data Types in Memory 🔠

What can be stored in one memory cell? As mentioned, the smallest cell is **1 byte** (8 bits). Each bit can be either 0 or 1. Eight bits allow encoding of 2⁸ = 256 possible values. Thus, one cell can, for example, store an integer from 0 to 255.

If these numbers are interpreted as character codes, then **1 byte** represents 1 character (of type `char`): all Latin and Cyrillic letters, digits, and basic symbols can be encoded within these 256 possibilities.

<div style="width: 90%; border: 2px solid #4CAF50; border-radius: 10px; padding: 20px; background-color: #f0f8ff; font-family: 'Courier New', Courier, monospace;">
  <p style="font-size: 16px; line-height: 1.6;">
    Encoding tables map each bit combination to a specific character. Most modern encodings build on <u>the ASCII table</u>, which consists of 128 characters corresponding to values 0 to 127. Many single-byte encodings also provide mappings for the range 128 to 255, typically including characters from national alphabets such as Cyrillic. 🔠
  </p>
  <p style="font-size: 16px; line-height: 1.6;">
More complex encodings use two bytes or even a variable number of bytes to encode various alphabets, ideograms, emojis, and many other symbols.
  </p>
</div>








The most common integer type, **int**, occupies **4 bytes** and can represent numbers from –2,147,483,648 to 2,147,483,647. In 4 bytes (32 bits), 1 bit is used for the sign, leaving 31 bits to represent 2³¹ = 2,147,483,648 different values. Since zero is also represented, there is one fewer positive number compared to negative. If your numbers do not exceed two billion in absolute value, this type is sufficient. Note that there are roughly four billion different numbers, but half are negative. When a variable is guaranteed to be non-negative, the **unsigned int** type is preferable, storing values from 0 to 4,294,967,295. 🔢



<div style="width: 90%; border: 2px solid #4CAF50; border-radius: 10px; padding: 20px; background-color: #f0f8ff; font-family: 'Courier New', Courier, monospace;">
    <p style="font-size: 16px; line-height: 1.6;">
        Caution: If an operation results in a number outside the allowable range, <strong>overflow</strong> occurs and the result may be incorrect. For example, for the <code>int</code> type:
        <br>
        <span style="display:inline-block; margin-left: 2em;">2147483647 + 1 = −2147483648</span>  
        <br>
        <span style="display:inline-block; margin-left: 2em;">2147483647 × 3 = 2147483645</span>  
        <br>
        This issue does not occur in languages with arbitrary-length integers, such as Python. ⚠️
    </p>
</div>
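
The overflow results in the note above can be reproduced with a small sketch. Python integers never overflow, so the hypothetical helper `wrap_int32` (not part of any library) simulates the two's complement wraparound that 32-bit hardware typically performs. Note that the C++ standard formally treats signed overflow as undefined behavior, so this illustrates the usual outcome rather than a guarantee.

```python
def wrap_int32(value: int) -> int:
    """Reduce an arbitrary Python integer to the signed 32-bit range."""
    value &= 0xFFFFFFFF          # keep only the lowest 32 bits
    if value >= 0x80000000:      # top bit set: the pattern encodes a negative number
        value -= 0x100000000
    return value


INT_MAX = 2_147_483_647

print(wrap_int32(INT_MAX + 1))   # -2147483648
print(wrap_int32(INT_MAX * 3))   # 2147483645
print(INT_MAX * 3)               # 6442450941 (Python itself does not overflow)
```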

### 3. Floating-Point Numbers 🔢


Floating-point (i.e., fractional) numbers are usually represented using the **double** type, which occupies **8 bytes**. They can represent both very large numbers (e.g., ±10³⁰⁸) and numbers close to zero (e.g., ±10⁻³⁰⁸).

Calculations with floating-point numbers are often imprecise and can accumulate errors, leading to unexpected behavior. For example, because numbers are stored in binary rather than decimal, **0.1 × 3 ≠ 0.3**. Therefore, comparisons involving floating-point values must account for a margin of error, such as using:  
  `abs(0.1 * 3 - 0.3) < EPS`  
where `EPS` is the acceptable error tolerance.
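
A minimal Python sketch of this comparison follows. The tolerance value `1e-9` is an illustrative assumption; a suitable `EPS` depends on the magnitudes involved in the task. The standard library function `math.isclose` offers a ready-made alternative based on relative tolerance.

```python
import math

EPS = 1e-9  # illustrative tolerance, chosen only for this example

product = 0.1 * 3

print(product == 0.3)              # False: the binary representations differ
print(abs(product - 0.3) < EPS)    # True: equal within the tolerance
print(math.isclose(product, 0.3))  # True: relative-tolerance comparison
```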

Using floating-point values as loop counters can also affect iteration counts due to accumulated imprecision; in such cases, integers are preferred. 🔢
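
The effect on iteration counts can be seen in a short sketch. The function names are hypothetical; both loops are meant to step from 0 to 1 in increments of 0.1, but the floating-point counter falls slightly short of 1.0 after ten additions and therefore runs one extra time.

```python
def count_float_steps() -> int:
    """Count iterations when the loop variable itself is a float."""
    x = 0.0
    steps = 0
    while x < 1.0:
        x += 0.1          # rounding error accumulates with every addition
        steps += 1
    return steps


def count_int_steps() -> int:
    """Count iterations with an integer counter; the float is derived from it."""
    steps = 0
    for i in range(10):
        x = i / 10        # computed fresh each time, so errors do not accumulate
        steps += 1
    return steps


print(count_float_steps())  # 11
print(count_int_steps())    # 10
```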


### 4. Memory Addressing 📍


Sometimes a program needs to store an **address**—the location of data in RAM. A few years ago, addressing used 4 bytes (32 bits), which can address 2³² bytes (4 GB). However, as computer capacities and program demands have grown, 4 GB has become insufficient. Programmers now use **8-byte addressing**, which should be adequate for a long time. 📍
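
As a rough illustration, the address size used by the running Python interpreter can be queried with the standard `struct` module, where the format code `"P"` stands for a native pointer. The helper `addressable_bytes` is a hypothetical name introduced here to show how the address size limits the amount of addressable memory.

```python
import struct


def addressable_bytes(address_size: int) -> int:
    """Number of distinct byte addresses expressible with the given address size."""
    return 2 ** (8 * address_size)


pointer_size = struct.calcsize("P")   # size of a native pointer in bytes

print(pointer_size)                   # typically 8 on a 64-bit build
print(addressable_bytes(4))           # 4294967296 bytes, i.e. 4 GB
```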




### 5. Composite Data Types 📊


Let’s calculate the memory required to store an array of 10 strings, each containing 20 characters.

1. **Text Storage:**  


$$10 \ \text{strings} \times \frac{20\,\text{symbols}}{\text{string}} \times \frac{1\,\text{byte}}{\text{symbol}} = 200\,\text{bytes}
$$

This would be sufficient if the strings were stored contiguously in memory. However, manipulating them is challenging: inserting a character into the first string might require shifting all subsequent strings.  


<style>
    .name-grid {
        display: inline-grid;
        grid-template-columns: repeat(24, 20px); /* Reduced from 30px */
        gap: 0px;
        font-family: monospace;
        margin: 5px 0;
    }
    .cell {
        width: 20px;                /* Reduced from 30px */
        height: 25px;               /* Reduced from 40px */
        display: flex;
        align-items: center;
        justify-content: center;
        border: 1px solid #ccc;
        font-size: 12px;            /* Added smaller font */
        font-weight: bold;
    }
    .letter {
        background: #4CAF50;
        color: white;
    }
    .separator {
        background: #B2DFDB;
    }
    .empty {
        background: #E0F2F1;
        opacity: 0.3;
    }
    .arrow {
        position: relative;
    }
    .arrow::after {                 /* Improved arrow visibility */
        content: "➜";              /* Changed to a more visible arrow */
        position: absolute;
        top: 2px;                   /* Adjusted position */
        left: 4px;                  /* Adjusted position */
        color: #2E7D32;            /* Darker green for better visibility */
        font-size: 14px;            /* Larger arrow */
        font-weight: bold;
    }
    .image {
        display: block;
        margin: 0 auto;
        width: 100%;
        max-width: 600px;
    }
</style>

<div class="name-grid">
    <!-- First Row -->
    <div class="cell letter">H</div>
    <div class="cell letter">O</div>
    <div class="cell letter">L</div>
    <div class="cell letter">L</div>
    <div class="cell letter">Y</div>
    <div class="cell separator arrow"></div>
    <div class="cell letter">G</div>
    <div class="cell letter">E</div>
    <div class="cell letter">O</div>
    <div class="cell letter">R</div>
    <div class="cell letter">G</div>
    <div class="cell letter">E</div>
    <div class="cell separator arrow"></div>
    <div class="cell letter">M</div>
    <div class="cell letter">I</div>
    <div class="cell letter">C</div>
    <div class="cell letter">H</div>
    <div class="cell letter">A</div>
    <div class="cell letter">E</div>
    <div class="cell letter">L</div>
    <div class="cell separator"></div>
    <div class="cell empty"></div>
    <div class="cell empty"></div>
    <div class="cell empty"></div>
</div>

<div class="name-grid">
    <!-- Second Row -->
    <div class="cell letter">H</div>
    <div class="cell letter">E</div>
    <div class="cell letter">N</div>
    <div class="cell letter">R</div>
    <div class="cell letter">Y</div>
    <div class="cell separator arrow"></div>
    <div class="cell letter">G</div>
    <div class="cell letter">R</div>
    <div class="cell letter">E</div>
    <div class="cell letter">G</div>
    <div class="cell letter">O</div>
    <div class="cell letter">R</div>
    <div class="cell letter">Y</div>
    <div class="cell separator arrow"></div>
    <div class="cell letter">M</div>
    <div class="cell letter">I</div>
    <div class="cell letter">C</div>
    <div class="cell letter">H</div>
    <div class="cell letter">E</div>
    <div class="cell letter">L</div>
    <div class="cell letter">L</div>
    <div class="cell letter">E</div>
    <div class="cell separator"></div>
    <div class="cell empty"></div>
</div>

2. **Memory Structure & Overhead**

In memory, arrays and composite structures use **pointers** (memory addresses) rather than storing objects directly. Let's analyze our example:

- **String Storage:** 200 bytes
- **Pointer Array:** 10 pointers × 8 bytes = 80 bytes
- **Total Memory:** 200 + 80 = 280 bytes 💾

Most programming languages add **metadata overhead** to objects:
- A small integer in Python: ~30 bytes (vs 4 bytes in low-level languages)
- String storage: ~40 bytes overhead
- Arrays: Even more overhead

Objects are typically stored **non-contiguously** in memory, with access managed through stored addresses. This structure provides flexibility but increases memory usage. 🔍


Let's calculate how much memory a Python list (Python's built-in dynamic array) of 10 numbers consumes and what that memory is spent on:

```python
import sys

# Sizes shown are for a 64-bit CPython build and may differ elsewhere.
print(sys.getsizeof(42))      # => 28 bytes for a small integer
print(sys.getsizeof([]))      # => 56 bytes for an empty list
print(sys.getsizeof([42]))    # => 64 = (56 + 8) bytes for a list with one element
print(sys.getsizeof([1, 2, 3, 4, 5,
                     6, 7, 8, 9, 10]))  # => 136 = (56 + 8*10) bytes for ten elements
# The integer objects themselves are stored separately
# and add 280 = (28 * 10) bytes.
```

Output:

```
28
56
64
136
```

As a result, we spend 56 bytes to create the list itself, plus 8 bytes for each element it holds: these are the element addresses. But that's not all: each number object also needs 28 bytes. In total, a list of ten numbers in Python takes up `56 + (8 + 28) * 10 = 416` bytes. Compare this with 40 bytes in C++. **Python consumes ten times more memory!** 🤯
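
The same accounting can be sketched as a small function. `list_footprint` is a hypothetical helper, not a standard API, and it deliberately follows the simple model used above: it counts every element as a separate object, although CPython shares small integers between containers. For comparison, the standard `array` module stores raw numbers side by side, much like a C++ array. No expected output is shown because exact sizes depend on the interpreter build.

```python
import sys
from array import array


def list_footprint(values: list) -> int:
    """Size of the list object plus the sizes of the objects it points to."""
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


numbers = list(range(1, 11))
packed = array("i", numbers)            # "i" = C int, stored without per-object overhead

print(list_footprint(numbers))          # list header + pointers + integer objects
print(packed.itemsize * len(packed))    # raw payload of the packed array
```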

### 6. Memory Deallocation 🗑️

It's crucial to understand that when you stop using an object, it doesn't automatically mean the memory is freed. In C++, you need to manage memory deallocation manually. Built-in containers like `std::vector` simplify this task: they automatically allocate necessary memory when creating the container and free it when the container variable goes out of scope. 💾

Most other programming languages use a garbage collector to handle memory deallocation. The garbage collector periodically checks if objects can still be used. If it determines that no variables or other objects reference an object, that object is destroyed. 🗑️

However, garbage collection is a resource-intensive operation, so some languages delay it until free space becomes scarce. For example, Java programs (unless specifically constrained at launch) often face high memory consumption due to storing many old and unused objects. ⚠️
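
The rule "an object is destroyed once nothing references it" can be observed in Python with a weak reference, which watches an object without keeping it alive. This is only a sketch: `Payload` is a hypothetical placeholder class, and the explicit `gc.collect()` call is there so the example does not rely on CPython's habit of freeing objects immediately.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for any object that occupies memory."""


payload = Payload()
probe = weakref.ref(payload)   # observes the object without counting as a reference

print(probe() is payload)      # True: the object is still alive

del payload                    # remove the only strong reference
gc.collect()                   # ask the collector to run now

print(probe() is None)         # True: the object has been destroyed
```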

## Fixed-Size Arrays 📏

### 1. A Deep Dive 🔍


In previous lessons, we discussed how simple and composite data types can be stored in computer **RAM**. Now we'll explore various **data structures**. A **data structure** is a way of organizing information in memory that enables specific operations, such as quick data searching or modification.

An **array** is one of the fundamental data structures. In this lesson, we'll examine the capabilities of different array types and their implementations. 📚

The simplest type of array has a **fixed size** and can store elements of the same type. For example, if we create an array of ten integers, we cannot add another element or store an object of an incompatible type. 

You'll find such arrays in:
- **C:**
    ```c
    int numbers[10];
    ```
- **C++:**
    ```cpp
    std::array<int, 10> numbers;
    ```

**Fixed-size arrays** support only two operations:
- Retrieve an element's value by index
- Overwrite a value at an index 🔄

While they may not be highly functional, they are extremely efficient. Each operation executes in **O(1) time complexity**. ⚡

To understand how arrays achieve such fast element access, let's examine their mechanics and see how they store data in memory. 🧐
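
Python's built-in list is not a fixed-size array, but the standard `ctypes` module can create one, which makes the two supported operations and both restrictions easy to try out. This is a sketch for illustration; the helper `try_store` is a hypothetical name introduced here.

```python
import ctypes

TenInts = ctypes.c_int * 10      # array type holding exactly ten C ints
numbers = TenInts()              # one contiguous block, initialised to zeros

numbers[3] = 42                  # overwrite a value at an index
print(numbers[3])                # 42 (retrieve a value by index)
print(len(numbers))              # 10


def try_store(index: int, value) -> bool:
    """Return True if the write succeeded, False if the array rejected it."""
    try:
        numbers[index] = value
        return True
    except (IndexError, TypeError):
        return False


print(try_store(10, 1))          # False: there is no eleventh slot
print(try_store(0, "text"))      # False: a string is not a C int
```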




### 2. How Arrays Work 🔍



An array is a collection of data elements of the same type, stored sequentially in memory. It always occupies a continuous block of memory. The operating system communicates the exact location of this memory block to the program when the array is created. 🎯

The zero element is located at the beginning of the allocated memory block, immediately followed by the first element, and so on in order. They are placed next to each other without gaps. By knowing an element's position in memory—its address—we can read or write that element. Let's explore how to determine element addresses. 📍

Consider an array `numbers` containing 10 unsigned integers. We know that the address of the zero element is 1000. To find the position of the next element, we add the size of one element (in bytes) to the starting address. Since each number occupies 4 bytes, the first (zero) element will occupy bytes 1000, 1001, 1002, and 1003. Therefore, the element with index 1 will be stored at address 1004. 🔢

<!DOCTYPE html>
<html>
<head>
<style>
.memory-table {
    border-collapse: collapse;
    width: 50%;
    max-width: 700px;
    font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Arial, sans-serif;
    background: #E0F2F1;
}

.memory-table td, .memory-table th {
    border: 1px solid #B2DFDB;
    padding: 8px;
    text-align: center;
}

.memory-table th {
    background: #E0F2F1;
    font-weight: normal;
    text-align: left;
    padding: 12px;
}

.calculation {
    font-size: 1em;
}

.green {
    color: #4CAF50;
}

.separator {
    border-left: 2px solid #81C784;
}

.header-cell {
    background: #E0F2F1;
    text-align: left !important;
    font-weight: normal;
    padding: 12px !important;
}
    .image {
        display: block;
        margin-left: auto;
        margin-right: auto;
        width: 50%;
    }

</style>
</head>
<body>
<table class="memory-table">
    <tr>
        <th colspan="4" class="header-cell calculation">
            20 <span class="green">+ 2 × 256</span> = 532
        </th>
        <th colspan="4" class="header-cell separator calculation">
            192
        </th>
        <th colspan="4" class="header-cell separator calculation">
            4 <span class="green">+ 1 × 256²</span> = 65540
        </th>
        <th class="header-cell separator">
            ...
        </th>
        <th class="header-cell separator">
            Final Numbers Values
        </th>
    </tr>
    <tr>
        <td>20</td>
        <td>2</td>
        <td>0</td>
        <td>0</td>
        <td class="separator">192</td>
        <td>0</td>
        <td>0</td>
        <td>0</td>
        <td class="separator">4</td>
        <td>0</td>
        <td>1</td>
        <td>0</td>
        <td class="separator">...</td>
        <td class="separator">Individual Byte Values</td>
    </tr>
    <tr>
        <td>1000</td>
        <td>1001</td>
        <td>1002</td>
        <td>1003</td>
        <td class="separator">1004</td>
        <td>1005</td>
        <td>1006</td>
        <td>1007</td>
        <td class="separator">1008</td>
        <td>1009</td>
        <td>1010</td>
        <td>1011</td>
        <td class="separator">...</td>
        <td class="separator">Addresses</td>
    </tr>
</table>
</body>
</html>

<!-- Image with 50% size -->
![Memory Layout](https://hcti.io/v1/image/752382d3-30cc-404f-a26d-3c941fa138f6)
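
The byte values in the table can be reproduced with the standard `struct` module. The sketch assumes little-endian byte order (least significant byte first), which is what the table's calculations imply, and reuses the starting address 1000 purely as a label.

```python
import struct

values = [532, 192, 65540]
raw = struct.pack("<3I", *values)   # "<" = little-endian, "I" = 4-byte unsigned int

BASE = 1000                          # starting address taken from the example
for offset, byte in enumerate(raw):
    print(BASE + offset, byte)       # address and the byte value stored there

print(list(raw))  # [20, 2, 0, 0, 192, 0, 0, 0, 4, 0, 1, 0]
```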

> **Question**: At what address is the tenth element of the array, `numbers[9]`, located?
> 
> - $1009$
> - $1010$
> - $1036$
> - $1040$

The **correct answer** is $1036$. The element with index 0 is at address $1000$, and each element occupies 4 bytes. Therefore, the tenth element, which has index 9, is at $1000 + 9 \times 4 = 1036$.
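
The address calculation is a single multiplication and addition, which is why access by index takes constant time. A minimal sketch, with `element_address` as a hypothetical helper name:

```python
def element_address(base: int, index: int, element_size: int) -> int:
    """Address of the element at `index` in a contiguous array starting at `base`."""
    return base + index * element_size


print(element_address(1000, 0, 4))  # 1000
print(element_address(1000, 1, 4))  # 1004
print(element_address(1000, 9, 4))  # 1036
```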

> **[Rita]:** Why can't we store elements of different types in an array?
>
> **[Timothy]:** Because then the cells would be different sizes, and you wouldn't be able to calculate the address of an element using simple arithmetic based on its index. You'd need to know how many cells of each size are at indices 0 through 8. And that's impossible if the cell sizes are different.
> 
> **[Rita]:** But in Python and JavaScript, you can store elements of different types in an array. How do they manage that?
>
> **[Timothy]:** The thing is, such arrays don't store the objects themselves, but only pointers to them. A pointer, or address, always takes up 8 bytes, regardless of the size of the object it points to. Remember how we stored strings in an array.

<!-- Image with 50% size -->
![Memory Layout](https://hcti.io/v1/image/eb16e8d1-8193-4288-965b-035a6000dce3)

*The array of names `["HOLLY", "HENRY", "GREG"]` stores not the strings themselves, but pointers to them. The strings are located in other memory segments. Note that the string **"GREGORY"**, although stored in memory, is not contained in the names array. However, this name can be referenced by another array or even multiple arrays.*
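
A short Python sketch of the same idea follows. It uses the names from the caption; the list `other` and the `mixed` example are assumptions added for illustration. The `is` operator checks whether two slots point to the same object, and `sys.getsizeof` shows that the list's own size does not depend on what its slots point to.

```python
import sys

holly, henry, greg, gregory = "HOLLY", "HENRY", "GREG", "GREGORY"

names = [holly, henry, greg]     # stores three references, not the strings themselves
other = [gregory, holly]         # a second list may reference the same objects

print(names[0] is other[1])      # True: both slots point to the same string
print(gregory in names)          # False: stored in memory, but not referenced by names

# Every slot holds one pointer, so the element type does not change the list's size.
ints = [0] * 3
mixed = [0] * 3
mixed[0] = "a much longer string " * 100
mixed[1] = 3.14
mixed[2] = names

print(sys.getsizeof(ints) == sys.getsizeof(mixed))  # True
```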

### 3. How to Save Arrays to Files 💾



When a program needs to save data to a file, an important question arises: how to write information so it can be unambiguously read later. If we're writing and reading data of known size, no serious problems arise. 🔄

However, sometimes the data volume varies from task to task, for example when a program needs to write and read an array. A common format for storing arrays in files is to write the array length N first, followed by N data blocks containing the array values. 📊

This encoding allows compact storage of multiple arrays (of fixed length) one after another. For example, the sequence `[3, 'a', 'b', 'c', 2, 'y', 'z', ...]` encodes two arrays. First comes an array of length 3 containing the first three letters of the alphabet. Then comes an array of length 2 containing the last two letters. Additional data may follow. 🔡

This idea is also frequently used in binary data formats. For instance, data in PNG files is divided into several blocks, with each block's length specified at its beginning. This makes it possible to determine where one block ends and another begins. 🎨
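
The length-prefixed format described above can be sketched in a few lines. The function names and the concrete layout (a 4-byte unsigned length followed by 4-byte signed integers, little-endian) are assumptions made for this example, not a standard file format. The letters from the earlier example are represented by their character codes.

```python
import struct


def encode_arrays(arrays: list[list[int]]) -> bytes:
    """Write each array as its length N followed by N 4-byte integers."""
    out = bytearray()
    for arr in arrays:
        out += struct.pack("<I", len(arr))            # length prefix
        out += struct.pack(f"<{len(arr)}i", *arr)     # the N values
    return bytes(out)


def decode_arrays(data: bytes) -> list[list[int]]:
    """Read arrays back by using each length prefix to find the next boundary."""
    arrays = []
    offset = 0
    while offset < len(data):
        (n,) = struct.unpack_from("<I", data, offset)
        offset += 4
        arrays.append(list(struct.unpack_from(f"<{n}i", data, offset)))
        offset += 4 * n
    return arrays


encoded = encode_arrays([[97, 98, 99], [121, 122]])   # codes for 'a','b','c' and 'y','z'
decoded = decode_arrays(encoded)

print(len(encoded))   # 28 = (4 + 3*4) + (4 + 2*4) bytes
print(decoded)        # [[97, 98, 99], [121, 122]]
```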

## Complexity of Insertion and Deletion in Dynamic Arrays 📊


## Reflection Questions


To be continued...

## Additional Resources

- [Introduction to Algorithms (CLRS)](https://mitpress.mit.edu/books/introduction-to-algorithms) 📘
- [MIT OpenCourseWare - Introduction to Algorithms](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/) 🎓
- [GeeksforGeeks - Analysis of Algorithms](https://www.geeksforgeeks.org/analysis-of-algorithms-set-1-asymptotic-analysis/) 💻
- [Python `timeit` Documentation](https://docs.python.org/3/library/timeit.html) 📄
- [Understanding Spatial Complexity](https://www.geeksforgeeks.org/spatial-complexity-of-algorithms/) 🌟
- [cProfile and Profiling Techniques](https://docs.python.org/3/library/profile.html) 🔍