# NumPy Module (Complete)

This notebook combines three parts into one teaching module:

1. **NumPy Basics — Arrays and Vectorized Computations**
2. **NumPy Advanced — Matrix operations, broadcasting patterns, performance**
3. **NumPy Case Study — Social network analysis (NumPy practice)**

_Source notebooks merged: `NumPy-Basics-complete.ipynb`, `NumPy-Advanced-template.ipynb`, `Numpy-Case-Study-complete.ipynb`._


## How to use this notebook

**Recommended environment**
- Python **3.10+**
- NumPy **1.23+** (any recent version is fine)

**Install (pick one)**
- `pip install numpy`
- `conda install numpy`

**Rule for sanity:** one project = one environment (venv/conda).

## Learning goals (2-week NumPy module)

By the end, students should be able to:
- Explain what an `ndarray` is (shape, dtype, views vs copies)
- Replace most Python loops with **vectorized** NumPy operations
- Use **broadcasting** for feature scaling and efficient computations
- Handle missing values (`NaN`) and apply boolean masks for cleaning/filtering
- Implement core ML math in NumPy (dot products, normalization, simple regression intuition)
- Apply NumPy to a realistic case study (network analysis)

Throughout, we’ll connect each concept to common data-science tasks.

## Quick navigation

- **Part 1:** NumPy Basics — arrays, indexing, vectorization
- **Part 2:** NumPy Advanced — broadcasting, linear algebra, performance ideas
- **Part 3:** Case Study — social network analysis with matrices

Tip: use the notebook outline / table of contents in VS Code (or Jupyter) to jump between headings.

---

# Part 1 — NumPy Basics


# NumPy - Arrays and Vectorized Computations
NumPy is the fundamental package for scientific computing and data manipulation in Python. It provides:

- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities
- <a href="https://www.numpy.org/devdocs/user/quickstart.html">online NumPy tutorial</a>
- <a href="https://docs.scipy.org/doc/numpy-1.13.0/reference/index.html"> NumPy reference </a>

## Basics

Import the NumPy package under the name `np`.

In [1]:
import numpy as np

Create an `ndarray` (N-dimensional array object), the key feature of NumPy, using `numpy.array()`.

In [2]:
# create ndarray from list
x = [1, 2, 3, 4]
arr1 = np.array(x)

In [3]:
type(arr1)

numpy.ndarray

In [4]:
type(x)

list

array creation functions: `arange`, `ones`, `zeros`, `empty`, `eye`

In [7]:
# like the built-in range(), but returns an ndarray instead of a range object
np.arange(2, 10, 2)

array([2, 4, 6, 8])

In [6]:
range(5)

range(0, 5)

In [8]:
np.ones((2, 3))


array([[1., 1., 1.],
       [1., 1., 1.]])

In [9]:
np.zeros((4,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
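
The list above also names `eye` and `empty`, which the cells so far do not show. A minimal sketch of both follows; the variable names `I` and `buf` are only illustrative.

```python
import numpy as np

# Identity matrix: ones on the main diagonal, zeros elsewhere
I = np.eye(3)
print(I)

# np.empty allocates memory without initializing it, so the values are arbitrary;
# only the shape and dtype are meaningful until you assign to it
buf = np.empty((2, 3))
buf[:] = 7.0          # fill before reading
print(buf)
```

`np.empty` is mainly useful as a preallocated output buffer; prefer `np.zeros` when the initial values matter.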

check array properties `.ndim`, `.shape`, `.dtype`

### Axes in NumPy (the #1 confusion point)

For a 2D array `X` with shape `(n_rows, n_cols)`:

- `axis=0` aggregates **down the rows** → result has length `n_cols` (per column / per feature)
- `axis=1` aggregates **across columns** → result has length `n_rows` (per row / per sample)

This shows up everywhere in data science (e.g., mean per feature for scaling).

In [None]:
import numpy as np

X_demo = np.array([[1, 2, 3, 4],
                   [10, 20, 30, 40],
                   [100, 200, 300, 400]])

print("X_demo shape:", X_demo.shape)
print("\nSum axis=0 (per column):", X_demo.sum(axis=0))
print("Sum axis=1 (per row):", X_demo.sum(axis=1))

In [10]:
x = np.ones((3, 4))


In [11]:

x.shape

(3, 4)

In [12]:
x.dtype

dtype('float64')

In [13]:
x.ndim

2

Change data types for ndarrays

In [14]:
x = np.array([1.3, 2.4, 3.5])

In [15]:
x.dtype

dtype('float64')

In [16]:
y = np.array(x, dtype='int')

In [17]:
y.dtype

dtype('int32')

In [None]:
# change an array of strings to numeric format
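
One way to do this conversion is `astype`, which returns a new array with the requested dtype. The sketch below uses made-up string data (`s` is hypothetical, not part of the original notebook) and also shows that a float-to-int cast truncates rather than rounds.

```python
import numpy as np

# Hypothetical example data: numbers stored as strings
s = np.array(["1.5", "2", "-3.25"])

nums = s.astype(float)      # parse each string as a float
print(nums, nums.dtype)

# Casting floats to int truncates toward zero (it does not round)
ints = np.array([1.3, 2.4, 3.5, -1.7]).astype(int)
print(ints)
```

The exact integer dtype (`int32` or `int64`) depends on the platform and NumPy version, so pass an explicit dtype such as `np.int64` when the width matters.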


## Operations between arrays

### Two arrays of the same dimensions and shape
Any arithmetic operation between equal-size arrays is applied element-wise.

In [22]:
x = np.ones((4, 5))
x

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [24]:
y = np.arange(20).reshape((4, 5))
y

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [25]:
x + y

array([[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.],
       [11., 12., 13., 14., 15.],
       [16., 17., 18., 19., 20.]])

In [26]:
x * y

array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14.],
       [15., 16., 17., 18., 19.]])

### Array and Scalar
Arithmetic operations with scalars propagate the value to each element

In [27]:
y

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [28]:
y + 1

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])

### Basic Indexing and Slicing

In [38]:
y

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [39]:
y[0, 0]

0

In [40]:
y[3, 4]

19

In [41]:
y[-1, -1]

19

In [42]:
y[0, :]

array([0, 1, 2, 3, 4])

In [43]:
y[:, -1]

array([ 4,  9, 14, 19])

In [44]:
y[0:2, 0:2]

array([[0, 1],
       [5, 6]])

Unlike list slices, array slices are views on the original array. Any modification to the view is reflected in the source array.

#### Views vs copies (important!)

Slicing usually returns a **view** (no data copied). Modifying the slice can modify the original array.
Use `.copy()` when you need an independent array.

In [None]:
a = np.arange(10)
b = a[2:6]          # view
b[:] = -1
print("a after modifying slice b:", a)

a = np.arange(10)
b = a[2:6].copy()   # copy
b[:] = -1
print("a after modifying copied slice b:", a)

In [45]:
y

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [47]:
y[0:2, :] = 0

In [48]:
y

array([[ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

More about views

To avoid changing the original array, make a copy. Any change to the copy won't affect the original array.

In [49]:
x = y.copy()

In [52]:
x[0:2, :] = 1

In [53]:
x

array([[ 1,  1,  1,  1,  1],
       [ 1,  1,  1,  1,  1],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [54]:
y

array([[ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

### Boolean Indexing

In [59]:
x = x - 12
x

array([[-11, -11, -11, -11, -11],
       [-11, -11, -11, -11, -11],
       [ -2,  -1,   0,   1,   2],
       [  3,   4,   5,   6,   7]])

Set all negative elements in `x` to 0

In [60]:
x < 0

array([[ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True],
       [ True,  True, False, False, False],
       [False, False, False, False, False]])

In [62]:
x[x<0] = 0

In [63]:
x

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 1, 2],
       [3, 4, 5, 6, 7]])

### Fancy Indexing Using Integer Arrays

Select a subset of rows

In [70]:
x

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 1, 2],
       [3, 4, 5, 6, 7]])

In [71]:
x[[0, 3], :]

array([[0, 0, 0, 0, 0],
       [3, 4, 5, 6, 7]])

Select a subset of rows and columns

In [72]:
x[[0, 3], :][:, [0, -1]]

array([[0, 0],
       [3, 7]])

Select a subset of elements at any positions

In [73]:
x[[0, -1], [0, -1]]

array([0, 7])
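
The last two cells behave differently: two index lists passed together are paired up element by element, while chained indexing selects a rectangular block. As a sketch of an alternative to chaining, `np.ix_` builds the row-by-column selection in one step (the names `pairs` and `block` are illustrative).

```python
import numpy as np

x = np.array([[0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 1, 2],
              [3, 4, 5, 6, 7]])

# Paired indexing: picks the elements at (0, 0) and (-1, -1)
pairs = x[[0, -1], [0, -1]]

# np.ix_ builds an open mesh: rows {0, 3} crossed with columns {0, -1}
block = x[np.ix_([0, 3], [0, -1])]

print(pairs)   # 1-D result, one value per (row, col) pair
print(block)   # 2-D result, same as x[[0, 3], :][:, [0, -1]]
```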

### How to represent missing values and infinity?
Missing values can be represented with `np.nan`, while `np.inf` represents infinity.
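
A short sketch of how to detect these special values. The key point is that `NaN` never compares equal to anything, so the dedicated test functions are needed; the array `v` is illustrative.

```python
import numpy as np

v = np.array([1.0, np.nan, np.inf, -np.inf])

# NaN is never equal to anything, including itself, so == cannot find it
print(v == np.nan)       # all False
print(np.isnan(v))       # True only at the NaN position
print(np.isinf(v))       # True at +inf and -inf
print(np.isfinite(v))    # True only for ordinary numbers
```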

### Practical missing-value handling (NaN)

In real datasets, missing values are common. NumPy provides `np.nan*` functions that **ignore NaNs**.

Common tools:
- `np.isnan(X)` → boolean mask of missing entries
- `np.nanmean(X, axis=...)`, `np.nanstd(...)` → aggregations that skip NaNs
- simple imputation: replace NaNs with column means (demo below)

In [None]:
X = np.array([[1.0, 2.0, np.nan],
              [2.0, np.nan, 3.0],
              [np.nan, 4.0, 6.0]])

col_means = np.nanmean(X, axis=0)
inds = np.where(np.isnan(X))
X_imputed = X.copy()
X_imputed[inds] = np.take(col_means, inds[1])

print("Column means:", col_means)
print("\nImputed X:\n", X_imputed)

In [75]:
x = np.ones((3, 4))
x

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [77]:
x[0, 0] = np.nan
x

array([[nan,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.]])

In [78]:
x + 1

array([[nan,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.]])

### Universal Functions `ufuncs`: Fast Element-wise Array Functions

Reference: https://docs.scipy.org/doc/numpy/reference/ufuncs.html

In [81]:
z = np.random.randint(-3, 3, (4, 3))
z

array([[-1, -2, -1],
       [ 0,  0, -1],
       [ 2, -1,  1],
       [ 1, -2,  1]])

Unary ufuncs, e.g. `abs()`, `sqrt()`, take one argument and perform element-wise transformations.

In [84]:
x = np.abs(z)
np.sqrt(x)

array([[1.        , 1.41421356, 1.        ],
       [0.        , 0.        , 1.        ],
       [1.41421356, 1.        , 1.        ],
       [1.        , 1.41421356, 1.        ]])

Binary ufuncs, e.g. `subtract()`, take two arguments.

In [85]:
np.subtract(x, np.ones((4, 3)))

array([[ 0.,  1.,  0.],
       [-1., -1.,  0.],
       [ 1.,  0.,  0.],
       [ 0.,  1.,  0.]])

In [86]:
x - np.ones((4, 3))

array([[ 0.,  1.,  0.],
       [-1., -1.,  0.],
       [ 1.,  0.,  0.],
       [ 0.,  1.,  0.]])

### Use a `lambda` function to perform complex element-wise array operations

In [90]:
f = lambda e: (e-2)/2

In [92]:
f(x)

array([[-0.5,  0. , -0.5],
       [-1. , -1. , -0.5],
       [ 0. , -0.5, -0.5],
       [-0.5,  0. , -0.5]])

In [93]:
x

array([[1, 2, 1],
       [0, 0, 1],
       [2, 1, 1],
       [1, 2, 1]])
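
The lambda above works on a whole array because it uses only arithmetic operators, which NumPy applies element-wise. A function with a Python `if` on its argument does not carry over in the same way. The sketch below (with a hypothetical `scalar_f`) shows the failure and one common vectorized replacement, `np.where`.

```python
import numpy as np

z = np.array([[-1, -2, -1],
              [ 0,  0, -1],
              [ 2, -1,  1],
              [ 1, -2,  1]])

# Scalar-style logic: fine for one number, ambiguous for a whole array
scalar_f = lambda e: e * 10 if e > 0 else 0

try:
    scalar_f(z)
except ValueError as err:
    print("ValueError:", err)

# Vectorized equivalent: choose element-wise between two alternatives
result = np.where(z > 0, z * 10, 0)
print(result)
```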

### Transposing Arrays and Swapping Axes

In [95]:
arr1d = np.arange(10)
arr1d

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [97]:
arr2d = arr1d.reshape((2, 5))

In [98]:
arr2d

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [99]:
arr2d.T

array([[0, 5],
       [1, 6],
       [2, 7],
       [3, 8],
       [4, 9]])

### Reshape a multi-dimensional array as one dimension
- `ravel()` returns a view of the parent array whenever possible, so changing the result can change the parent
- `flatten()` always returns a copy, so changing the flattened array does not change the parent

In [110]:
arr2d

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [111]:
arr2d.ravel()

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [112]:
arr2d.flatten()

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
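
Both calls print the same values, so the difference only shows up when you modify the result. A small sketch using `np.shares_memory` to make it visible:

```python
import numpy as np

arr2d = np.arange(10).reshape((2, 5))

r = arr2d.ravel()      # view here, because arr2d is contiguous in memory
f = arr2d.flatten()    # always an independent copy

print(np.shares_memory(arr2d, r))   # True
print(np.shares_memory(arr2d, f))   # False

r[0] = 99              # writes through to arr2d
f[1] = -1              # does not affect arr2d
print(arr2d)

# For a non-contiguous array such as a transpose, ravel has to copy
print(np.shares_memory(arr2d.T, arr2d.T.ravel()))   # False
```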

### Statistics
NumPy statistics functions include `median`, `average`, `mean`, `amax`, `amin` and many others.

<a href="https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.statistics.html"> Reference </a>

In [113]:
x = np.random.randint(1 , 10, (4, 5))
x

array([[3, 6, 7, 7, 9],
       [5, 5, 7, 5, 1],
       [5, 6, 5, 8, 9],
       [1, 4, 3, 3, 6]])

In [126]:
np.median(x, axis=0)

array([4. , 5.5, 6. , 6. , 7.5])

In [127]:
np.median(x, axis=1)

array([7., 5., 6., 3.])

In [129]:
np.mean(x, axis=0)

array([3.5 , 5.25, 5.5 , 5.75, 6.25])

In [131]:
np.mean(x, axis=1)

array([6.4, 4.6, 6.6, 3.4])

In [134]:
np.amin(x, axis=0)

array([1, 4, 3, 3, 1])

### Exercises

Create a random integer array of shape (4, 3) with values in [0, 10].

Calculate the **mean for each column** (i.e., per feature).

Subtract each column's mean from the matrix.

Set elements that are less than 5 to 0.

---

# Part 2 — NumPy Advanced


## NumPy - Advanced

### Matrix product
Unlike some languages (e.g., MATLAB), multiplying two 2D arrays with `*` performs an **element-wise** product. For a matrix product, use `@` or `np.dot()`.
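
A minimal illustration of the difference on two small matrices (`A` and `B` here are illustrative and separate from the exercise arrays below):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A * B)         # element-wise: [[0, 2], [3, 0]]
print(A @ B)         # matrix product: [[2, 1], [4, 3]]
print(np.dot(A, B))  # same as A @ B for 2-D arrays
```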

In [None]:
import numpy as np

In [None]:
x = np.ones((3, 3))*3
x

In [None]:
y = np.ones((3, 3))*2
y

Compute the matrix product of `x` and `y`

### Linear algebra (`numpy.linalg`)
NumPy includes standard linear-algebra routines in `numpy.linalg`.

Common ones:
- `diag` — diagonal elements / create diagonal matrix
- `dot` / `@` — matrix multiplication
- `trace` — sum of diagonal elements
- `det` — determinant
- `eig` — eigenvalues/eigenvectors
- `inv` — matrix inverse (use with care)
- `solve` — solve linear systems (often preferable to `inv`)
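
As a sketch of why `solve` is usually preferred over `inv`, the example below solves a small made-up system both ways. The matrix `M` and right-hand side `rhs` are hypothetical.

```python
import numpy as np

# Hypothetical 2x2 system: 3a + b = 9, a + 2b = 8
M = np.array([[3.0, 1.0],
              [1.0, 2.0]])
rhs = np.array([9.0, 8.0])

sol = np.linalg.solve(M, rhs)        # solves M @ sol = rhs directly
sol_inv = np.linalg.inv(M) @ rhs     # same answer via the explicit inverse

print(sol)                           # approximately [2. 3.]
print(np.allclose(sol, sol_inv))
print(np.linalg.det(M), np.trace(M))
```

Both routes agree here; `solve` skips forming the inverse, which is generally faster and numerically more stable for larger or ill-conditioned systems.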

In [None]:
from numpy import linalg

In [None]:
z = np.random.randint(1, 10, (3, 3))
z

Compute the **inverse** of `z` (note: in practice, prefer `np.linalg.solve` when possible).

### Concatenating and Splitting Array
`numpy.concatenate` takes a sequence of arrays and joins them together in order along the given axis.
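
A brief sketch of both directions, plus `np.split` as the inverse operation. The arrays `a` and `b` are illustrative and separate from the exercise arrays.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

v = np.concatenate([a, b], axis=0)   # stack rows: shape (4, 2)
h = np.concatenate([a, b], axis=1)   # stack columns: shape (2, 4)
print(v.shape, h.shape)

# np.split reverses the operation: cut v into 2 equal pieces along axis 0
top, bottom = np.split(v, 2, axis=0)
print(np.array_equal(top, a), np.array_equal(bottom, b))
```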

In [None]:
x

In [None]:
y

Concatenate `x` and `y` horizontally

Concatenate `x` and `y` vertically

### Broadcasting

#### General Broadcasting Rules
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when

- they are equal, or
- one of them is 1

In the examples below, any axis of length one, or any missing leading axis, is stretched to match the other array during the broadcast operation:

<pre>
A      (2d array):  4 x 3
B      (1d array):      1
Result (2d array):  4 x 3
</pre>

<pre>
A      (2d array):  5 x 4
B      (1d array):      4
Result (2d array):  5 x 4
</pre>

<pre>
A      (3d array):  2 x 3 x 5
B      (2d array):      3 x 5
Result (3d array):  2 x 3 x 5
</pre>

<pre>
A      (3d array):  2 x 3 x 5
B      (3d array):  2 x 1 x 5
Result (3d array):  2 x 3 x 5
</pre>

<pre>
A      (4d array):  8 x 1 x 6 x 1
B      (3d array):      7 x 1 x 5
Result (4d array):  8 x 7 x 6 x 5
</pre>
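
The shape tables above can be checked without building any arrays. This sketch assumes NumPy 1.20 or newer, where `np.broadcast_shapes` is available.

```python
import numpy as np

# np.broadcast_shapes applies the rules without allocating any arrays
print(np.broadcast_shapes((4, 3), (1,)))              # (4, 3)
print(np.broadcast_shapes((5, 4), (4,)))              # (5, 4)
print(np.broadcast_shapes((2, 3, 5), (2, 1, 5)))      # (2, 3, 5)
print(np.broadcast_shapes((8, 1, 6, 1), (7, 1, 5)))   # (8, 7, 6, 5)

# Incompatible trailing dimensions (3 vs 4) raise a ValueError
try:
    np.broadcast_shapes((4, 3), (4,))
except ValueError as err:
    print("ValueError:", err)
```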

#### Data science example: feature scaling with broadcasting

A common preprocessing step is **standardization**:
$$
X_{scaled} = \frac{X - \mu}{\sigma}
$$
where $\mu$ and $\sigma$ are the per-feature mean and standard deviation.

Broadcasting makes this a one-liner (no loops).

In [None]:
X = np.random.RandomState(0).randn(6, 3) * 10 + 50  # pretend features
mu = X.mean(axis=0)   # per feature
sigma = X.std(axis=0)

X_scaled = (X - mu) / sigma

print("Means (should be ~0):", X_scaled.mean(axis=0))
print("Stds  (should be ~1):", X_scaled.std(axis=0))

Example: 
<pre>
A      (2d array):  4 x 3
B      (1d array):      1
Result (2d array):  4 x 3
</pre>

In [None]:
A = np.ones((4, 3))
A

In [None]:
B = 2

What is the result of A + B?

Example: 
<pre>
A      (2d array):  4 x 3
B      (1d array):      3
Result (2d array):  4 x 3
</pre>


In [None]:
A

In [None]:
B = np.ones(3)
B

What is the result of A + B?

### Sorting

`np.sort()` returns a new sorted array

In [None]:
x = np.random.randint(1, 100, 6)
x

What is the result of np.sort(x)?

The ndarray `sort` instance method is an in-place sort, meaning that the array contents are rearranged without producing a new array.

What is the result of x.sort()?

`argsort()` returns an array of integer indices that tells you how to reorder the data to be in sorted order
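
The three sorting tools can be compared side by side on a small fixed array. The data in `scores` is made up for illustration.

```python
import numpy as np

scores = np.array([40, 10, 30, 20])   # hypothetical data

print(np.sort(scores))      # new array [10 20 30 40]; scores is unchanged
order = scores.argsort()    # indices that would sort scores: [1 3 2 0]
print(order)
print(scores[order])        # indexing with them gives the sorted values

scores.sort()               # in place: returns None and modifies scores
print(scores)
```

`argsort` is the one to reach for when a second array (for example, names) must be reordered in the same way as the sorted values.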

In [None]:
x = np.random.randint(1, 100, 6)
x

What is the result of x.argsort()?

---

# Part 3 — NumPy Case Study


# NumPy Case Study — Social Network Analysis (Revisited)

## Finding key connectors
You’re a new data scientist, and your VP asks: **Who are the “key connectors”** in the user network?

We’ll represent the network as an **adjacency matrix** and use linear algebra (eigenvectors) to compute a simple centrality score.

In [2]:
# A list of users, each represented by a dict that contains the user's id and name
users = [
    { "id": 0, "name": "Hero" },
    { "id": 1, "name": "Dunn" },
    { "id": 2, "name": "Sue" },
    { "id": 3, "name": "Chi" },
    { "id": 4, "name": "Thor" },
    { "id": 5, "name": "Clive" },
    { "id": 6, "name": "Hicks" },
    { "id": 7, "name": "Devin" },
    { "id": 8, "name": "Kate" },
    { "id": 9, "name": "Klein" }
]

# the "friendship" data is represented as a list of pairs of IDs
friendships = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4),
               (4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]


In graph theory, eigenvector centrality (also called eigencentrality) is a measure of the influence of a node in a network. Relative scores are assigned to all nodes in the network based on the concept that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes. A high eigenvector score means that a node is connected to many nodes who themselves have high scores.

Google's PageRank and the Katz centrality are variants of the eigenvector centrality.

<b>The premise is that a node's importance is determined by how important its neighbors are.</b>

Create an **adjacency matrix** $G$ for the network graph, where $G[i, j] = 1$ if users $i$ and $j$ are friends (and 0 otherwise).

In [4]:
import numpy as np
# find number of users
n = len(users)
# create matrix of all zeros
G = np.zeros((n, n))
# iterate friendships
for i in friendships:
    x, y = i
    G[x, y] = 1
    G[y, x] = 1

In [5]:
G

array([[0., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
       [1., 0., 1., 1., 0., 0., 0., 0., 0., 0.],
       [1., 1., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 1., 1., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 1., 1., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 1., 1., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.]])
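
The loop above is easy to read; as a sketch of the vectorized style from Part 1, the same matrix can be filled with fancy indexing. The row sums then give each user's number of friends (degree), a simpler connector measure to compare against later. The names `rows`, `cols`, and `degree` are illustrative.

```python
import numpy as np

friendships = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4),
               (4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]
n = 10

# Split the edge list into row and column index arrays
rows, cols = np.array(friendships).T

G = np.zeros((n, n))
G[rows, cols] = 1      # set every (i, j) at once with fancy indexing
G[cols, rows] = 1      # friendship is mutual, so mirror it

print(np.array_equal(G, G.T))   # symmetric
degree = G.sum(axis=1)          # number of friends per user
print(degree)
```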

## Modeling 
Denote importance (eigenvector centrality) of node $i$ as $x_i$. Then a node's importance is proportional to the sum of the importance of its neighbors: 

### <center>$\sum_j G(i, j)x_j = \lambda x_i $ </center>

where $\lambda$ is a constant.
If we use $X$ to denote the vector of node importances $[x_1, \dots, x_n]$, we have:

### <center> $GX = \lambda X$ </center>

### Eigenvector and eigenvalues
In linear algebra, for the equation $GX = \lambda X$ with nonzero $X$, we call $X$ an eigenvector of $G$ and $\lambda$ the corresponding eigenvalue.

### Eigenvector centrality
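It generally makes sense for a measure of node importance to be non-negative, so the eigenvector centrality of the nodes is an eigenvector of the adjacency matrix whose elements are all positive. The standard convention is to take the eigenvector associated with the largest eigenvalue. For a connected undirected graph such as this one, the Perron-Frobenius theorem guarantees that this eigenvector can be scaled so that all of its entries are positive; for a disconnected graph that guarantee no longer holds.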
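
To build intuition for $GX = \lambda X$, here is a sketch of power iteration. It is a standard alternative technique, not the method this notebook uses, and the iteration count of 500 is an arbitrary assumption. Repeatedly replacing each score with the sum of its neighbors' scores and rescaling should converge to the eigenvector of the largest eigenvalue.

```python
import numpy as np

friendships = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4),
               (4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]
n = 10
G = np.zeros((n, n))
for i, j in friendships:
    G[i, j] = G[j, i] = 1

x = np.ones(n)                       # start from equal importance
for _ in range(500):                 # fixed iteration count (assumption)
    x = G @ x                        # each node sums its neighbors' scores
    x = x / np.linalg.norm(x)        # rescale so the values stay bounded

lam = x @ G @ x                      # Rayleigh quotient estimates lambda
print(round(float(lam), 4))
print(np.round(x, 4))
```

The estimate `lam` can be compared with the largest eigenvalue reported by `np.linalg.eig`.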

#### <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.eig.html"> `numpy.linalg.eig(G)` </a> 

- Compute the eigenvalues and eigenvectors of a square array.
- Returns
    - w : The eigenvalues, each repeated according to its multiplicity. The eigenvalues are not necessarily ordered. The resulting array will be of complex type, unless the imaginary part is zero, in which case it will be cast to a real type. When the input array is real, the resulting eigenvalues will be real (0 imaginary part) or occur in conjugate pairs.
    - v :  The normalized (unit “length”) eigenvectors, such that the column v[:,i] is the eigenvector corresponding to the eigenvalue w[i].

Compute eigenvalues and eigenvectors

In [6]:
(eigenvalues, eigenvectors) = np.linalg.eig(G)

In [8]:
eigenvalues

array([-2.28606304e+00, -1.76854651e+00,  2.66882292e+00,  2.22467637e+00,
        1.13080448e+00, -3.69021730e-01,  3.99327521e-01, -1.00000000e+00,
       -1.00000000e+00, -3.92523115e-17])

In [11]:
eigenvalues.shape

(10,)

In [12]:
eigenvectors

array([[-7.78306087e-02,  3.96167522e-01,  3.85780947e-01,
         1.35826264e-01,  2.54977291e-01,  5.19529199e-01,
         3.91121194e-01, -4.26401433e-01, -2.56663501e-02,
        -3.92523115e-17],
       [ 8.89628390e-02, -3.50320345e-01,  5.14790516e-01,
         1.51084740e-01,  1.44164731e-01, -9.58587819e-02,
         7.80927283e-02,  2.13200716e-01,  7.18657803e-01,
         5.07490473e-17],
       [ 8.89628390e-02, -3.50320345e-01,  5.14790516e-01,
         1.51084740e-01,  1.44164731e-01, -9.58587819e-02,
         7.80927283e-02,  2.13200716e-01, -6.92991453e-01,
        -5.07490473e-17],
       [-2.14506889e-01,  5.73710649e-01,  4.73313262e-01,
         4.92036477e-02, -2.36119899e-01, -3.88296444e-01,
        -4.38029346e-01,  2.23589522e-16, -1.21894513e-16,
         3.92523115e-17],
       [ 3.12450593e-01, -3.13993278e-01,  2.33608249e-01,
        -1.92707288e-01, -5.55334901e-01,  3.35007389e-01,
        -3.31102630e-01, -4.26401433e-01, -2.56663501e-02,
         1.

Find the eigenvector associated with the largest eigenvalue

In [13]:
eigenvalues.max()

2.6688229150885756

In [14]:
eigenvalues.argmax()

2


### Eigenvector centrality (NumPy implementation)

We take the eigenvector corresponding to the **largest eigenvalue** of the adjacency matrix.
Intuition: a node is important if it connects to other important nodes.

Steps:
1. Find index of the largest eigenvalue
2. Extract the corresponding eigenvector
3. Convert to a real-valued score and normalize
4. Rank users by score

In [None]:
# Pick the eigenvector associated with the largest eigenvalue
idx = np.argmax(eigenvalues.real)   # eigenvalues may be complex; use real part for comparison
centrality = eigenvectors[:, idx].real

# Make scores non-negative and normalize for interpretability
centrality = np.abs(centrality)
centrality = centrality / centrality.sum()

print("Top 10 users by eigenvector centrality (index, score):")
top_k = np.argsort(-centrality)[:10]
for u in top_k:
    print(u, round(float(centrality[u]), 6))
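
Because this adjacency matrix is symmetric, `np.linalg.eigh` is a possible alternative to `eig`. It returns real eigenvalues in ascending order, so the largest one is always last and no `.real` or `argmax` step is needed. The sketch below rebuilds `G` so it runs on its own; the `names` list is copied from the `users` data above.

```python
import numpy as np

friendships = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4),
               (4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]
names = ["Hero", "Dunn", "Sue", "Chi", "Thor",
         "Clive", "Hicks", "Devin", "Kate", "Klein"]
n = len(names)
G = np.zeros((n, n))
for i, j in friendships:
    G[i, j] = G[j, i] = 1

# eigh assumes a symmetric matrix: real output, eigenvalues in ascending order
vals, vecs = np.linalg.eigh(G)
centrality = np.abs(vecs[:, -1])           # last column pairs with the largest eigenvalue
centrality = centrality / centrality.sum()

# Report the three highest-scoring users by name
for u in np.argsort(-centrality)[:3]:
    print(names[u], round(float(centrality[u]), 4))
```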