In [9]:
!pip install numpy
import numpy as np
import time



In [20]:
import time
import numpy as np

# --- Pure Python Lists ---
size = 10_000_000 # Ten million elements

print("--- Pure Python List Operations ---")
list_a = list(range(size))
list_b = list(range(size))

start_time = time.time()
# Element-wise addition
result_list = [list_a[i] + list_b[i] for i in range(size)]
end_time = time.time()
print(f"Time taken for Python list addition: {end_time - start_time:.4f} seconds")
time_python = end_time - start_time

# --- NumPy Arrays ---
print("\n--- NumPy Array Operations ---")
numpy_a = np.arange(size) # More efficient way to create NumPy arrays
numpy_b = np.arange(size)

start_time = time.time()
# Element-wise addition (NumPy is optimized for this)
result_numpy = numpy_a + numpy_b
end_time = time.time()
print(f"Time taken for NumPy array addition: {end_time - start_time:.4f} seconds")
time_numpy = end_time - start_time
# Optional: Verify a small part of the results to show they're doing the same thing
# print(f"\nFirst 5 elements of Python result: {result_list[:5]}")
# print(f"First 5 elements of NumPy result: {result_numpy[:5]}")

--- Pure Python List Operations ---
Time taken for Python list addition: 4.7739 seconds

--- NumPy Array Operations ---
Time taken for NumPy array addition: 0.1061 seconds


In [21]:
time_python / time_numpy

45.00358937706215
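
A single `time.time()` measurement is noisy, so the ratio above will vary from run to run. As a rough sketch of a steadier measurement, the standard-library `timeit` module can repeat each operation and keep the best time. The smaller `size` and the `number`/`repeat` values here are arbitrary choices for a quick run, not values taken from the cells above.

```python
import timeit

import numpy as np

size = 100_000  # assumption: smaller than the 10 million above, to keep this quick

list_a = list(range(size))
list_b = list(range(size))
numpy_a = np.arange(size)
numpy_b = np.arange(size)

# Take the best of several repeats to reduce the effect of background noise.
t_list = min(timeit.repeat(
    lambda: [x + y for x, y in zip(list_a, list_b)], number=5, repeat=3))
t_numpy = min(timeit.repeat(lambda: numpy_a + numpy_b, number=5, repeat=3))

speedup = t_list / t_numpy
print(f"Python list: {t_list:.4f} s, NumPy: {t_numpy:.4f} s, ratio: {speedup:.1f}x")

# Both approaches should produce the same values.
result_list = [x + y for x, y in zip(list_a, list_b)]
result_numpy = numpy_a + numpy_b
print("Same result:", result_list == result_numpy.tolist())
```

The exact ratio depends on the machine, the NumPy build, and the array size, so treat any single number as indicative only.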

In [None]:
'''
This note demonstrates NumPy's efficiency over standard Python lists. Plain assignment (a = b) isn't an element-wise operation: it only binds a second name to the same object, or makes a copy if you ask for one. The comparison therefore uses a genuinely element-wise operation such as a + b or a * 2.

Let's illustrate with a common operation: adding two large lists/arrays element-wise.

import time
import numpy as np

# --- Pure Python Lists ---
size = 10_000_000 # Ten million elements

print("--- Pure Python List Operations ---")
list_a = list(range(size))
list_b = list(range(size))

start_time = time.time()
# Element-wise addition
result_list = [list_a[i] + list_b[i] for i in range(size)]
end_time = time.time()
print(f"Time taken for Python list addition: {end_time - start_time:.4f} seconds")


# --- NumPy Arrays ---
print("\n--- NumPy Array Operations ---")
numpy_a = np.arange(size) # More efficient way to create NumPy arrays
numpy_b = np.arange(size)

start_time = time.time()
# Element-wise addition (NumPy is optimized for this)
result_numpy = numpy_a + numpy_b
end_time = time.time()
print(f"Time taken for NumPy array addition: {end_time - start_time:.4f} seconds")

# Optional: Verify a small part of the results to show they're doing the same thing
# print(f"\nFirst 5 elements of Python result: {result_list[:5]}")
# print(f"First 5 elements of NumPy result: {result_numpy[:5]}")

Explanation of Efficiency:

When you run the code, the NumPy array addition is significantly faster than the equivalent operation on standard Python lists, often by a factor of 10x to 100x. The run recorded above came out at roughly 45x.

Why NumPy is More Efficient:

C Implementation: NumPy's core operations are implemented in optimized, compiled C code. Python lists carry per-element overhead at the Python object level.
Contiguous Memory: NumPy arrays store elements of the same data type in one contiguous block of memory, which is cache-friendly and lets an operation run over the whole array without an explicit Python loop.
No Per-Element Type Dispatch: A Python list can hold objects of different types, so the interpreter must look up each element's type and the matching operation at run time. NumPy arrays are homogeneous, so that check happens once per array rather than once per element.
Vectorization: NumPy operations like + are "vectorized": a single call hands the whole array to a compiled loop. A list comprehension such as [list_a[i] + list_b[i] for i in range(size)] instead goes through the interpreter for every element.

In essence, for numerical computations on large datasets, NumPy combines compiled code with a compact memory layout to dramatically outperform pure Python lists.
'''
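
To make the contiguous-memory point concrete, here is a minimal sketch comparing how much memory a list and an array need for the same integers. The list figure adds the container to the size of each int object as reported by `sys.getsizeof`, so it is only an approximation (CPython shares some small ints). The size `n` is an arbitrary choice for illustration.

```python
import sys

import numpy as np

n = 1_000  # assumption: a small size, chosen only to keep the illustration quick

py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list stores pointers to separate Python int objects.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An ndarray stores the raw values back to back in a single buffer.
arr_bytes = np_arr.nbytes  # itemsize * number of elements

print("list (container + int objects):", list_bytes, "bytes (approximate)")
print("ndarray data buffer:", arr_bytes, "bytes")
print("bytes per ndarray element:", np_arr.itemsize)
print("C-contiguous:", np_arr.flags["C_CONTIGUOUS"])
```

Each `int64` element takes exactly 8 bytes in the array, while every list entry costs a pointer plus a full Python object.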

## 1. np.array


In [None]:
## NumPy Arrays: A Brief Intro
'''
NumPy (Numerical Python) is the fundamental package for numerical computation in Python. The most important object in NumPy is the ndarray (N-dimensional array) type.

np.array(): Creating a NumPy Array
The np.array() function is the primary way to create a NumPy array. It takes a Python list, tuple, or other sequence-like object as input and converts it into an ndarray.

Example:

import numpy as np

# Create a 1D array from a list
arr1d = np.array([1, 2, 3, 4, 5])
print("1D Array:", arr1d)

# Create a 2D array from a list of lists
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2D Array:\n", arr2d)

type(): Checking the Object Type
The built-in Python type() function tells you the type of any Python object. When you create a NumPy array, type() will confirm it's a numpy.ndarray.

Example:

import numpy as np

arr = np.array([1, 2, 3])
print("Type of arr:", type(arr))
# Output: Type of arr: <class 'numpy.ndarray'>

dtype: Data Type of Array Elements
While type() tells you the type of the array object itself, dtype is an attribute of the NumPy array that specifies the data type of the elements stored within the array. NumPy arrays are homogeneous, meaning all elements in a single array must be of the same data type. This homogeneity is crucial for NumPy's performance.

NumPy automatically infers the dtype when you create an array, but you can also explicitly specify it.

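Common dtypes:

int64, int32: Signed integers (the default integer width is platform-dependent)
float64, float32: Floating-point numbers (float64 is the default)
bool: Booleans (True/False)
str_ (displayed as e.g. <U5): Fixed-width Unicode strings
object: Arbitrary Python objects, e.g. for mixed types (less efficient)

Examples:
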
import numpy as np

# Default dtype (inferred)
arr_int = np.array([1, 2, 3])
print("arr_int dtype:", arr_int.dtype)
# Output: arr_int dtype: int64 (or int32 on some systems)

arr_float = np.array([1.0, 2.5, 3.0])
print("arr_float dtype:", arr_float.dtype)
# Output: arr_float dtype: float64

# Explicitly specifying dtype
arr_float32 = np.array([1, 2, 3], dtype=np.float32)
print("arr_float32:", arr_float32, "dtype:", arr_float32.dtype)
# Output: arr_float32: [1. 2. 3.] dtype: float32

arr_bool = np.array([True, False, True])
print("arr_bool dtype:", arr_bool.dtype)
# Output: arr_bool dtype: bool

Understanding np.array(), type(), and dtype is fundamental to working effectively with NumPy for numerical data manipulation and computation.
'''
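
Because arrays are homogeneous, NumPy has to pick one dtype when the input mixes types. The sketch below shows the common int-plus-float case, an explicit conversion with `astype()`, and the `object` fallback. The variable names are illustrative only.

```python
import numpy as np

# Mixing ints and floats: NumPy upcasts everything to float64.
mixed = np.array([1, 2.5, 3])
print(mixed, mixed.dtype)

# astype() returns a converted copy; float -> int truncates toward zero.
as_int = mixed.astype(np.int64)
print(as_int, as_int.dtype)

# dtype=object keeps the original Python objects, at the cost of speed.
obj = np.array([1, "a", 2.5], dtype=object)
print(obj, obj.dtype)
```

Checking `arr.dtype` right after creating an array is a cheap way to catch an unintended upcast early.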


In [None]:
'''

What does "ndarray" mean?


ndarray in NumPy stands for N-dimensional array.

Let's break down what that means:

N-dimensional: This refers to the number of axes (or dimensions) the array has.

    A 1-dimensional array (like a simple list of numbers) has one axis.
    A 2-dimensional array (like a table or matrix) has two axes (rows and columns).
    A 3-dimensional array (like a cube of numbers) has three axes.
    NumPy arrays can have any number of dimensions, hence "N-dimensional".

Array: This refers to a grid of values, all of the same type, indexed by a tuple of non-negative integers.

So, an ndarray is an efficient container for large datasets of homogeneous (same-type) items. It is designed to perform mathematical operations on entire arrays (or parts of arrays) very quickly, which is why it sits at the core of scientific computing in Python.
'''
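
The number of axes and the size along each axis can be read from the `ndim` and `shape` attributes. A short sketch, with array names chosen only for illustration:

```python
import numpy as np

a1 = np.array([1, 2, 3])                    # one axis
a2 = np.array([[1, 2, 3], [4, 5, 6]])       # two axes: rows and columns
a3 = np.zeros((2, 3, 4))                    # three axes

for name, arr in [("a1", a1), ("a2", a2), ("a3", a3)]:
    print(name, "ndim =", arr.ndim, "shape =", arr.shape)

# Elements are indexed by a tuple of integers, one per axis.
print("a2[1, 2] =", a2[1, 2])
```

`a2[1, 2]` selects row 1, column 2, which is the value 6.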

## 2. np.arange 

In [None]:
'''np.arange() is a NumPy function that creates an array of evenly spaced values within a given interval. It is similar to Python's built-in range(), but it returns a numpy.ndarray instead of a range object (a lazy, immutable sequence), and it accepts floating-point arguments.

Think of it as: "Array range."

Key Parameters (similar to range()):

start (optional): The starting value of the interval. The default is 0.
stop (required): The end value of the interval. The generated array will not include this value.
step (optional): The spacing between values. The default is 1.

Brief Examples:

import numpy as np

# 1. Basic usage: From 0 up to (but not including) 5, step 1
arr1 = np.arange(5)
print("np.arange(5):", arr1)
# Output: [0 1 2 3 4]

# 2. Specifying start and stop: From 2 up to (not including) 8, step 1
arr2 = np.arange(2, 8)
print("np.arange(2, 8):", arr2)
# Output: [2 3 4 5 6 7]

# 3. Specifying start, stop, and step: From 1 up to (not including) 10, step 2
arr3 = np.arange(1, 10, 2)
print("np.arange(1, 10, 2):", arr3)
# Output: [1 3 5 7 9]

# 4. Using floating-point numbers (a key advantage over Python's range)
arr4 = np.arange(0.5, 3.0, 0.5)
print("np.arange(0.5, 3.0, 0.5):", arr4)
# Output: [0.5 1.  1.5 2.  2.5]

In summary: np.arange() is a convenient and efficient way to generate sequences of numbers as NumPy arrays, for example indices, time steps, or other sequences with a fixed step. Non-integer steps work too, but floating-point rounding can make the number of elements hard to predict, so np.linspace() is often the safer choice there.'''
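
To illustrate the comparison with the built-in `range()`, the sketch below builds the same sequence both ways, shows that only the array supports arithmetic on all elements at once, and confirms that `range()` rejects float arguments. Variable names are illustrative.

```python
import numpy as np

r = range(1, 10, 2)        # lazy sequence of Python ints
a = np.arange(1, 10, 2)    # ndarray holding the same values

print(list(r))
print(a)

# Arithmetic applies to every element of the ndarray in one step.
doubled = a * 2
print(doubled)

# range() only accepts integers; np.arange() also accepts floats.
try:
    range(0.5, 3.0)
except TypeError as exc:
    print("range() rejects floats:", exc)

# A negative step counts down; stop is still excluded.
down = np.arange(10, 0, -2)
print(down)
```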

## 3. np.linspace

In [None]:

'''np.linspace() is another powerful NumPy function used to create arrays with evenly spaced values, but it differs from np.arange() in a crucial way: instead of specifying a step, you specify the number of elements you want in the interval.

Think of it as: "Linear space" or "Linearly spaced numbers."

Key Parameters:

start (required): The starting value of the sequence.
stop (required): The end value of the sequence. By default, this value is included in the array (unlike np.arange()).
num (optional): The number of samples to generate. The default is 50.
endpoint (optional): If True (default), stop is the last sample. If False, stop is not included.
retstep (optional): If True, returns (samples, step), where step is the size of the spacing between samples.

When to use np.linspace() vs. np.arange():

Use np.linspace() when you know the start, end, and how many points you want in between (e.g., for plotting, creating a fixed number of data points for a function).
Use np.arange() when you know the start, end, and the step size you want between points (e.g., for creating sequences with regular increments like 0, 5, 10, ...).

Brief Examples:
'''

In [None]:
import numpy as np

# 1. Basic usage: 50 points evenly spaced between 0 and 1 (inclusive)
arr1 = np.linspace(0, 1)
print("np.linspace(0, 1) (default 50 points):\n", arr1)
# Output: [0.         0.02040816 0.04081633 ... 0.95918367 0.97959184 1.        ] (50 points)

# 2. Specifying the number of points: 5 points between 0 and 10 (inclusive)
arr2 = np.linspace(0, 10, num=5)
print("\nnp.linspace(0, 10, num=5):\n", arr2)
# Output: [ 0.   2.5  5.   7.5 10. ]

# 3. Excluding the endpoint: 5 points between 0 and 10 (10 not included)
arr3 = np.linspace(0, 10, num=5, endpoint=False)
print("\nnp.linspace(0, 10, num=5, endpoint=False):\n", arr3)
# Output: [0. 2. 4. 6. 8.]

# 4. Returning the step size
arr4, step = np.linspace(0, 10, num=5, retstep=True)
print("\nnp.linspace(0, 10, num=5, retstep=True):")
print("Array:", arr4)
print("Step size:", step)
# Output:
# Array: [ 0.   2.5  5.   7.5 10. ]
# Step size: 2.5

In [None]:
'''In data science and numerical computing, np.linspace() is incredibly useful for tasks like:

Generating values for plotting functions (e.g., x = np.linspace(-np.pi, np.pi, 100) for a sine wave).
Creating a fixed number of evenly distributed points for sampling or analysis.
Defining ranges for simulations or numerical methods.
'''
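
The plotting use case can be sketched without any plotting library: sample a sine wave at 100 evenly spaced points and check the spacing. With the endpoint included, the step works out to (stop - start) / (num - 1). The names `x`, `y`, and `expected_step` are illustrative.

```python
import numpy as np

num = 100
x, step = np.linspace(-np.pi, np.pi, num, retstep=True)
y = np.sin(x)  # one sine value per sample point

# With endpoint=True (the default), spacing is (stop - start) / (num - 1).
expected_step = (np.pi - (-np.pi)) / (num - 1)

print("points:", len(x))
print("first and last x:", x[0], x[-1])
print("step matches formula:", np.isclose(step, expected_step))
print("largest |sin(x)|:", np.abs(y).max())
```

The resulting `x` and `y` arrays could be passed directly to a plotting call such as Matplotlib's `plot(x, y)`.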

## 4. np.zeros

In [None]:
'''
np.zeros() is a NumPy function used to create a new array of a given shape and dtype, filled entirely with zeros.

It's extremely useful for:

Initializing arrays before populating them with calculated data.
Creating placeholders.
Operations that require a starting array of zeros (e.g., in linear algebra).

Syntax:

numpy.zeros(shape, dtype=float, order='C')

Parameters:

shape (required): This can be an integer (for a 1D array) or a tuple of integers (for multi-dimensional arrays) specifying the dimensions of the array.
dtype (optional): The desired data type for the array elements. If not specified, it defaults to float64 (a 64-bit floating-point number).
order (optional): 'C' for C-style row-major order (default), 'F' for Fortran-style column-major order. This affects how the array is stored in memory, which can be relevant for performance in advanced use cases.

Brief Examples:'''

In [None]:
import numpy as np

# 1. Create a 1D array of 5 zeros (default float dtype)
arr1d_zeros = np.zeros(5)
print("1D array of zeros:\n", arr1d_zeros)
# Output: [0. 0. 0. 0. 0.]

# 2. Create a 2D array (3 rows, 4 columns) of zeros (default float dtype)
arr2d_zeros = np.zeros((3, 4))
print("\n2D array of zeros:\n", arr2d_zeros)
# Output:
# [[0. 0. 0. 0.]
#  [0. 0. 0. 0.]
#  [0. 0. 0. 0.]]

# 3. Create a 3D array (2 "pages", 3 rows, 2 columns) of zeros
arr3d_zeros = np.zeros((2, 3, 2))
print("\n3D array of zeros:\n", arr3d_zeros)
# Output:
# [[[0. 0.]
#   [0. 0.]
#   [0. 0.]]
#
#  [[0. 0.]
#   [0. 0.]
#   [0. 0.]]]

# 4. Specify a different dtype (e.g., integer zeros)
arr_int_zeros = np.zeros(3, dtype=int)
print("\nInteger zeros:\n", arr_int_zeros)
# Output: [0 0 0]

# 5. Specify a boolean dtype (False is the "zero" equivalent)
arr_bool_zeros = np.zeros((2, 2), dtype=bool)
print("\nBoolean zeros:\n", arr_bool_zeros)
# Output:
# [[False False]
#  [False False]]
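
As a sketch of the "initialize, then populate" pattern mentioned above, the example below preallocates an integer array with `np.zeros()` and fills it in a loop, then shows how the `order` parameter is reflected in the array's memory-layout flags. The names `squares` and `fortran` are illustrative.

```python
import numpy as np

n = 5

# Preallocate the result, then fill it in place.
squares = np.zeros(n, dtype=np.int64)
for i in range(n):
    squares[i] = i * i
print("squares:", squares)  # values 0, 1, 4, 9, 16

# order='F' requests column-major (Fortran-style) storage.
fortran = np.zeros((3, 4), order="F")
print("F-contiguous:", fortran.flags["F_CONTIGUOUS"])
print("C-contiguous:", fortran.flags["C_CONTIGUOUS"])
```

For a computation this simple, a vectorized expression such as `np.arange(n) ** 2` avoids the loop entirely; preallocation matters most when each element has to be computed separately.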