## Assignment 22

### Q1. What are the benefits of the built-in array package, if any?

The built-in `array` package in Python provides an efficient way to store and manipulate homogeneous data, such as arrays of numeric data. Some of the benefits of the `array` package are:

1. Memory efficiency: Unlike Python lists, `array` objects are typed arrays and hence occupy less memory space. They provide a space-efficient way to store a large number of homogeneous items.

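2. Fast bulk operations: Because all items share one C type and live in a single contiguous buffer, bulk operations such as copying, slicing, and reading or writing binary data (`tobytes()`, `frombytes()`, `tofile()`, `fromfile()`) are efficient. Element-by-element Python loops, however, are generally not faster than with lists.

3. Interoperability: `array` objects can be converted to and from lists (`tolist()`, `fromlist()`) and bytes. They also support the buffer protocol, so their data can be shared with NumPy arrays or written to binary files without converting each element.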

Here's an example of using the `array` package to create and manipulate an array of integers:

```python
import array

# create an array of integers
arr = array.array('i', [1, 2, 3, 4, 5])

# access individual elements
print(arr[0])   # output: 1

# append elements
arr.append(6)

# insert elements
arr.insert(3, 7)

# remove elements
arr.remove(4)

# iterate over the elements
for num in arr:
    print(num)

# convert to a list
lst = arr.tolist()
```

In this example, an `array` of integers is created using the type code `'i'`, which represents a signed integer. The array is then manipulated using various methods such as `append`, `insert`, and `remove`. Finally, the `array` is converted to a Python list using the `tolist()` method.
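The memory-efficiency claim can be checked with a rough measurement. The sketch below assumes CPython, where `sys.getsizeof` on a list reports only the table of pointers and not the integer objects it refers to, so the two are added together. The variable names and the element count are illustrative only.

```python
import array
import sys

n = 100_000
as_list = list(range(n))
as_array = array.array('i', range(n))

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)

# An array stores the raw C values inline in one buffer.
array_bytes = sys.getsizeof(as_array)

print("bytes per array item:", as_array.itemsize)
print("list (approx.):", list_bytes, "bytes")
print("array:", array_bytes, "bytes")
```

The exact numbers depend on the platform and Python version, but the array should come out several times smaller than the list.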


### Q2. What are some of the array package's limitations?

The built-in `array` package trades flexibility for compactness. Some of its limitations are:

1. Restricted element types: An `array` holds only the C numeric types (and Unicode characters) selected by its type code. It cannot store strings, arbitrary objects, or a mix of types, and a value that does not fit raises `TypeError` or `OverflowError`.

2. One dimension only: There is no built-in support for matrices or higher-dimensional data.

3. No element-wise arithmetic: `+` concatenates two arrays and `*` repeats an array, exactly as with lists. Numerical work requires explicit Python loops, so it is usually no faster than using a list.

4. Small API: There are no built-in statistical, linear algebra, or broadcasting features. Those require a package such as NumPy.

Here's an example that shows some of these limitations:

```python
import array

# create an array of integers
arr = array.array('i', [1, 2, 3, 4, 5])

# only values that match the type code are accepted
try:
    arr.append(2.5)
except TypeError as exc:
    print('TypeError:', exc)

# values outside the range of the C type are rejected
try:
    arr.append(2**70)
except OverflowError as exc:
    print('OverflowError:', exc)

# + concatenates and * repeats; neither is element-wise arithmetic
print(arr + arr)   # output: array('i', [1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
print(arr * 2)     # output: array('i', [1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
```

In this example, an `array` of integers is created using the type code `'i'`. Appending a float fails with `TypeError`, appending an integer that is too large fails with `OverflowError`, and both `arr + arr` and `arr * 2` produce a longer array instead of adding or doubling the elements.

### Q3. Describe the main differences between the array and numpy packages. 

The array and numpy packages in Python both provide support for working with arrays. However, there are some main differences between them:

1. Data types: The array package supports only a limited set of data types, while numpy supports a much wider range. For example, the array package supports only fixed-size integers, floating-point numbers, and Unicode characters, each selected by a one-letter type code, while numpy supports many more, including bool, complex, string, and datetime.

2. Functionality: Numpy provides a much wider range of functionality than the array package. For example, numpy provides support for advanced mathematical operations like linear algebra and Fourier transforms, while the array package does not.

3. Performance: Both packages store their elements in contiguous blocks of memory, but numpy also provides vectorized operations that run in compiled code over the whole array. With the array package, element-wise computations still have to go through Python-level loops, so numpy handles large arrays much more efficiently.

Here are some examples to illustrate the differences:

1. Data types:

```python
import array
import numpy as np

# Create an array using the array package
a = array.array('i', [1, 2, 3])
print(a)

# Create an array using numpy
b = np.array([1, 2, 3], dtype=np.float64)
print(b)
```

Output:
```
array('i', [1, 2, 3])
[1. 2. 3.]
```

In this example, we create an array of integers using the array package and an array of floats using numpy.

2. Functionality:

```python
import array
import numpy as np

# Calculate the dot product of two arrays using numpy
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
dot_product = np.dot(a, b)
print(dot_product)

# Calculate the dot product of two arrays using the array package
a = array.array('i', [1, 2, 3])
b = array.array('i', [4, 5, 6])
dot_product = sum(x * y for x, y in zip(a, b))
print(dot_product)
```

Output:
```
32
32
```

In this example, we calculate the dot product of two arrays using numpy and the array package. Numpy provides a built-in function for this operation, while with the array package we need to write our own code.

3. Performance:

```python
import array
import numpy as np
import time

# Create a large array using the array package
a = array.array('i', range(1000000))

# Create a large array using numpy
b = np.arange(1000000)

# Time how long it takes to calculate the sum of the array using the array package
# (perf_counter is a monotonic, high-resolution clock intended for timing)
start_time = time.perf_counter()
sum_a = sum(a)
end_time = time.perf_counter()
print('Sum using array package:', sum_a, 'Time:', end_time - start_time)

# Time how long it takes to calculate the sum of the array using numpy
start_time = time.perf_counter()
sum_b = np.sum(b)
end_time = time.perf_counter()
print('Sum using numpy:', sum_b, 'Time:', end_time - start_time)
```

Output:
```
Sum using array package: 499999500000 Time: 0.05662178993225098
Sum using numpy: 499999500000 Time: 0.00030517578125
```

In this example, we create a large array using the array package and numpy and calculate the sum of each. We time how long the sum takes with each package and see that numpy is much faster. The exact timings vary between machines and runs, but the gap is typically large.


### Q4. Explain the distinctions between the empty, ones, and zeros functions.

In NumPy, the `empty`, `ones`, and `zeros` functions are used to create arrays of a specified shape and data type.

- `empty`: This function creates an array of the specified shape and type without initializing the elements. The elements hold whatever values happened to be in that memory, so they are arbitrary and must be overwritten before use. Because it skips initialization, this function can be faster than the other two.

Example:
```python
import numpy as np

arr_empty = np.empty((2, 3))
print(arr_empty)
```

Output:
```
[[4.67920178e-310 6.91251882e-310 6.91254416e-310]
 [6.91254418e-310 0.00000000e+000 0.00000000e+000]]
```

- `ones`: This function creates an array of the specified shape and type, with all elements set to 1.

Example:
```python
import numpy as np

arr_ones = np.ones((2, 3))
print(arr_ones)
```

Output:
```
[[1. 1. 1.]
 [1. 1. 1.]]
```

- `zeros`: This function creates an array of the specified shape and type, with all elements set to 0.

Example:
```python
import numpy as np

arr_zeros = np.zeros((2, 3))
print(arr_zeros)
```

Output:
```
[[0. 0. 0.]
 [0. 0. 0.]]
```

Note that you can also specify the data type of the array using the `dtype` parameter. By default, the data type is `float64`.
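As a small illustration of the `dtype` parameter, the sketch below creates one array with each function using a non-default type. The variable names are illustrative, and the array from `empty` is filled explicitly because its initial contents are not defined.

```python
import numpy as np

int_zeros = np.zeros((2, 3), dtype=int)     # integer zeros instead of 0.0
bool_ones = np.ones(3, dtype=bool)          # every element is True
raw = np.empty((2, 3), dtype=np.float32)    # uninitialized 32-bit floats

raw[:] = 7.5    # overwrite the arbitrary contents before reading them

print(int_zeros)
print(bool_ones)
print(raw.dtype)    # float32
```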

### Q5. In the fromfunction function, which is used to construct new arrays, what is the role of the callable argument?

In the `numpy.fromfunction` function, the callable argument is the function that computes the array's values. NumPy calls it once, passing one index array per dimension (each with the requested shape), and uses the return value as the new array. The callable should therefore work on whole arrays of indices, not on single integers. By default the index arrays are floats; pass `dtype=int` to get integer indices.

For example, let's say we want to create a 3x3 array where each element is the sum of its row and column indices. We can use the `numpy.fromfunction` function along with a lambda function that takes the row and column indices as input and returns their sum:

```python
import numpy as np

arr = np.fromfunction(lambda i, j: i + j, (3, 3))
print(arr)
```

Output:
```
[[0. 1. 2.]
 [1. 2. 3.]
 [2. 3. 4.]]
```

Here, the lambda function takes two arguments `i` and `j`, which are arrays holding the row and column index of every position. The expression `i + j` adds them element-wise, so each element of the result is the sum of its row and column indices.
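To see how the callable is invoked, the following sketch records every call it receives. The helper `row_plus_col` and the list `calls` are hypothetical names introduced for this illustration; `dtype=int` is passed so that the indices and the result are integers.

```python
import numpy as np

calls = []  # shapes of the arguments seen on each call

def row_plus_col(i, j):
    # i and j are whole index arrays, not single integers
    calls.append((i.shape, j.shape))
    return i + j

grid = np.fromfunction(row_plus_col, (3, 3), dtype=int)

print(len(calls))   # 1
print(calls[0])     # ((3, 3), (3, 3))
print(grid)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]
```

Because the function is called only once, a callable that cannot handle array arguments (for example, one that uses `if i > j:`) will not work as expected with `fromfunction`.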


### Q6. What happens when a numpy array is combined with a single-value operand (a scalar, such as an int or a floating-point value) through addition, as in the expression A + n? 

When a numpy array is combined with a single-value operand (a scalar) through addition, the scalar value is added element-wise to each element of the array.

For example:
```python
import numpy as np

A = np.array([1, 2, 3])
n = 2
result = A + n

print(result)
```
Output:
```
[3 4 5]
```
In the above example, the scalar value 2 has been added to each element of the numpy array `A`. The resulting array `result` is a new array with the same shape as `A`, and each element in `result` is the sum of the corresponding element in `A` and the scalar `n`. `A` itself is left unchanged.
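The scalar's type also matters. As a minimal sketch (variable names are illustrative), adding a float to an integer array produces a new float array, while the original keeps its integer values:

```python
import numpy as np

A = np.array([1, 2, 3])
B = A + 0.5     # the result is promoted to a floating-point dtype

print(B)        # [1.5 2.5 3.5]
print(B.dtype)  # float64
print(A)        # [1 2 3]
```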

### Q7. Can array-to-scalar operations use combined operation-assign operators (such as += or *=)? What is the outcome?

Yes, array-to-scalar operations can use combined operation-assign operators such as `+=` or `*=`.

When we use a combined operation-assign operator with a scalar, the operation is applied to each element and the result is written back into the existing array, so no new array is created. The array keeps its shape and its dtype. An operation whose result cannot be safely cast back to that dtype, such as adding a float to an integer array with `+=`, raises an error.

Here are some examples:

```python
import numpy as np

# create a 2x2 array
A = np.array([[1, 2], [3, 4]])

# add a scalar to the array using +=
A += 2
print(A)
# Output:
# [[3 4]
#  [5 6]]

# multiply the array by a scalar using *=
A *= 2
print(A)
# Output:
# [[ 6  8]
#  [10 12]]
```

In the above examples, we add 2 to each element of the array using `+=`, and then multiply each element of the array by 2 using `*=`.
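The in-place behaviour can be observed through a second name bound to the same array. The sketch below uses illustrative names and also shows the dtype restriction: adding a float in place to an integer array fails because the result cannot be stored back safely.

```python
import numpy as np

A = np.array([1, 2, 3])
alias = A           # second name for the same array object

A += 10             # modifies the existing array in place
print(alias is A)   # True
print(alias)        # [11 12 13]

failed = False
try:
    A += 0.5        # float result does not fit the integer dtype
except TypeError as exc:
    failed = True
    print("in-place add rejected:", type(exc).__name__)

print(A)            # [11 12 13]
```

Writing `A = A + 0.5` instead would succeed, because it builds a new float array and rebinds the name `A` to it.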


### Q8. Does a numpy array contain fixed-length strings? What happens if you allocate a longer string to one of these arrays?

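Yes. NumPy stores strings in fixed-width fields: every element of a string array occupies the same number of bytes, set by the dtype. When an array is built from existing strings, NumPy picks a width that fits the longest one. The width can also be chosen explicitly using the `dtype` argument, with the `S` specifier for byte strings or `U` for Unicode strings. For example, to create an array of 5 byte strings, each up to 10 bytes long, you can use:

```python
import numpy as np

# np.zeros (rather than np.empty) guarantees the strings start out empty
a = np.zeros(5, dtype='S10')
```

This creates an array `a` with 5 empty byte strings, each with room for 10 bytes.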

If you try to allocate a longer string to one of these arrays, the string will be truncated to the specified length. For example:

```python
a[0] = 'this is a longer string'
print(a[0])  # prints b'this is a '
```

In this example, the string "this is a longer string" is truncated to "this is a ", which fits in the array of length 10.
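The same truncation applies to Unicode string arrays whose width NumPy inferred automatically. The following sketch uses illustrative names and shows one way to avoid the truncation, namely converting to a wider dtype with `astype` before assigning:

```python
import numpy as np

names = np.array(['a', 'bcd'])   # width inferred from the longest string
print(names.dtype)               # <U3 on little-endian machines

names[0] = 'longer'              # silently truncated to 3 characters
print(names[0])                  # lon

wider = names.astype('U10')      # copy with room for 10 characters
wider[0] = 'longer'
print(wider[0])                  # longer
```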



### Q9. What happens when you combine two numpy arrays using an operation like addition (+) or multiplication (*)? What are the conditions for combining two numpy arrays?

When two NumPy arrays are combined using an operation like addition or multiplication, the corresponding elements in the arrays are combined using that operation to create a new array. This is called an element-wise operation. In particular, `*` multiplies element by element; it is not matrix multiplication.

The conditions for combining two NumPy arrays are:

1. The arrays must have the same shape or be broadcastable to the same shape.
2. The corresponding elements must be compatible with the operation being performed.

If the arrays have different shapes, NumPy will attempt to broadcast them to a common shape. Broadcasting allows NumPy to perform element-wise operations on arrays with different shapes.

Here are some examples:

Example 1: Addition of two arrays
```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a + b
print(c)
```
Output:
```
[5 7 9]
```

Example 2: Multiplication of two arrays
```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a * b
print(c)
```
Output:
```
[ 4 10 18]
```

Example 3: Broadcasting
```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([10, 20, 30])
c = a + b
print(c)
```
Output:
```
[[11 22 33]
 [14 25 36]]
```

In this example, `b` is broadcast to the shape `(2, 3)` to match the shape of `a`. Then, element-wise addition is performed to create the new array `c`.
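Broadcasting compares shapes from the last dimension backwards, and each pair of dimensions must be equal or one of them must be 1. The sketch below (illustrative names) shows a combination that fails this check and one way to make it work by reshaping:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
b = np.array([10, 20])                  # shape (2,)

try:
    a + b   # trailing dimensions 3 and 2 are incompatible
except ValueError as exc:
    print("cannot broadcast:", exc)

col = b[:, np.newaxis]   # shape (2, 1), stretches across the columns
print(a + col)
# [[11 12 13]
#  [24 25 26]]
```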

### Q10. What is the best way to use a Boolean array to mask another array?

In NumPy, Boolean indexing can be used to filter elements of an array. A Boolean array used as an index selects the elements at the positions where the mask is `True`. The mask should have the same shape as the array being masked, with one Boolean per element. In practice the mask is usually built from a comparison on the array itself, such as `arr > 2`, and then applied as `arr[mask]`.

Here is an example:

```python
import numpy as np

# create an array to be masked
arr = np.array([1, 2, 3, 4, 5])

# create a Boolean array to use as the mask
mask = np.array([True, False, True, False, True])

# apply the mask to the array
masked_arr = arr[mask]

print(masked_arr) # output: [1 3 5]
```

In this example, we created an array `arr` containing the values `[1, 2, 3, 4, 5]` and a Boolean array `mask` of the same shape, where the `True` values mark the elements to keep. Indexing `arr` with `mask` returns a new array, `masked_arr`, that contains only those elements.
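Masks are more often computed than written out by hand. As a sketch with illustrative names, the example below builds a mask from two conditions combined with `&`, uses it to select elements, and then uses the inverted mask (`~`) to overwrite the remaining elements in place:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# parentheses are required because & binds tighter than the comparisons
mask = (arr % 2 == 0) & (arr > 2)

print(arr[mask])   # [4 6]

arr[~mask] = 0     # assign through the inverted mask
print(arr)         # [0 0 0 4 0 6]
```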

### Q11. What are three different ways to get the standard deviation of a wide collection of data using both standard Python and its packages? Sort the three of them by how quickly they execute. 

In Python, there are various ways to get the standard deviation of a wide collection of data using standard Python and its packages such as NumPy, Pandas, and statistics.

Here are three different ways to get the standard deviation of a wide collection of data:

1. Using Python's statistics module: The statistics module in Python contains a function called `stdev()` that can be used to calculate the standard deviation of a list of numbers. Here's an example:

```python
import statistics

data = [2.5, 3.5, 4.0, 4.5, 5.0]
stdev = statistics.stdev(data)
print(f"{stdev:.4f}")
```

Output:
```
0.9618
```

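2. Using NumPy: NumPy is a popular package for scientific computing in Python. It contains a function called `std()` that can be used to calculate the standard deviation of an array. By default `std()` computes the population standard deviation (`ddof=0`); pass `ddof=1` to get the sample standard deviation, which is what `statistics.stdev()` computes. Here's an example:

```python
import numpy as np

data = np.array([2.5, 3.5, 4.0, 4.5, 5.0])
stdev = np.std(data, ddof=1)   # ddof=1 gives the sample standard deviation
print(f"{stdev:.4f}")
```

Output:
```
0.9618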
```

3. Using Pandas: Pandas is a popular package for data manipulation and analysis in Python. Its `Series` and `DataFrame` objects have a method called `std()` that calculates the sample standard deviation (`ddof=1` by default). Here's an example:

```python
import pandas as pd

data = pd.DataFrame({'col1': [2.5, 3.5, 4.0, 4.5, 5.0]})
stdev = data['col1'].std()
print(f"{stdev:.4f}")
```

Output:
```
0.9618
```

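In terms of speed on large data sets, NumPy is generally the fastest option because its computation runs in compiled C code over a contiguous buffer. Pandas is usually somewhat slower than NumPy because of the extra overhead of its `Series` and `DataFrame` objects, but it is still much faster than the `statistics` module, which is written in pure Python. Sorted from fastest to slowest: NumPy, Pandas, `statistics`. For very small inputs the call overhead of NumPy and Pandas can outweigh this advantage.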
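The ranking can be checked on your own machine with `timeit`. The sketch below is only a rough benchmark under stated assumptions: the data size, the seed, and the repeat count are arbitrary choices, and only `statistics` and NumPy are timed so that the example needs no further dependencies. A Pandas `Series.std()` call could be timed the same way.

```python
import statistics
import timeit

import numpy as np

rng = np.random.default_rng(0)      # fixed seed so runs are repeatable
values = rng.normal(size=20_000)    # NumPy array of floats
as_list = values.tolist()           # same data as a plain Python list

# Time three calls of each implementation on identical data.
t_stats = timeit.timeit(lambda: statistics.stdev(as_list), number=3)
t_numpy = timeit.timeit(lambda: np.std(values, ddof=1), number=3)

print(f"statistics.stdev: {t_stats:.4f} s")
print(f"np.std (ddof=1):  {t_numpy:.4f} s")
```

Both calls compute the sample standard deviation, so their results should agree up to floating-point rounding; only the elapsed times differ.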


### Q12. What is the dimensionality of a Boolean mask-generated array?

A Boolean mask has the same shape, and therefore the same dimensionality, as the array it was computed from. Its elements are `True` where the condition holds and `False` elsewhere. The array produced by indexing with that mask is different: it is always one-dimensional, because the selected elements are collected into a flat array regardless of where they sat in the original.

Here's an example:

```python
import numpy as np

# create a 3x3 array
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# create a Boolean mask for values greater than 5
mask = arr > 5

# apply the mask to the original array
new_arr = arr[mask]

print(new_arr)
```

Output:
```
[6 7 8 9]
```

In this example, `mask` is a Boolean array with the same dimensionality as `arr`, and `new_arr` is a 1-dimensional array with the selected elements from `arr`.
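The shapes involved can be printed directly. The sketch below (illustrative names) also shows two related cases: a one-dimensional mask applied to the rows of a 2-D array, which keeps the result two-dimensional, and `np.where`, which keeps the original shape by substituting a fill value for the unselected elements.

```python
import numpy as np

arr = np.arange(1, 10).reshape(3, 3)
mask = arr > 5

print(mask.shape)         # (3, 3)
print(arr[mask].shape)    # (4,)

row_mask = np.array([True, False, True])   # one Boolean per row
print(arr[row_mask].shape)                 # (2, 3)

print(np.where(mask, arr, 0))   # same shape as arr, zeros where mask is False
# [[0 0 0]
#  [0 0 6]
#  [7 8 9]]
```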
