# Exercises for using NumPy
Import the numpy module

In [2]:
import numpy as np


## Warm-Up Exercises

1. **Create a NumPy array**: Create a NumPy array of integers from 1 to 10.

2. **Array shape**: Find the shape of the array you created.

3. **Array data type**: Find the data type of the array. Convert it to a different data type.

4. **Array operations**: Perform basic arithmetic operations (addition, subtraction, multiplication, division) on the array.

5. **Reshape array**: Reshape the array into a 2x5 matrix.

6. **Indexing and slicing**: Access the third element of the array and slice the array to get the first 5 elements.

7. **Array statistics**: Calculate the sum, mean, and standard deviation of the array.

8. **Boolean indexing**: Create a boolean array that selects only the even numbers from the original array.

9. **Broadcasting**: Add a scalar (e.g., 5) to the original array using broadcasting.

In [10]:
# 1
arr1 = np.arange(1, 11)

# 2
print(arr1.shape)

# 3
print(arr1.dtype)

# 4
print(arr1 - np.random.rand(10))

# 5
print(np.reshape(arr1, (2,5)))

# 6
print(arr1[2], arr1[:5])

# 7
print(np.sum(arr1), np.mean(arr1), np.std(arr1))

# 8
print(arr1[arr1 % 2 == 0])

# 9
print(arr1 + 5)


(10,)
int32
[0.39147682 1.46069038 2.23596364 3.22465468 4.17158746 5.85682316
 6.95714298 7.51080833 8.61438107 9.83042129]
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
3 [1 2 3 4 5]
55 5.5 2.8722813232690143
[ 2  4  6  8 10]
[ 6  7  8  9 10 11 12 13 14 15]
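
Exercises 3 and 4 also ask for a type conversion and for all four arithmetic operations. A minimal sketch of one way to cover both, assuming `astype` for the conversion and a scalar operand of 2 chosen purely for illustration:

```python
import numpy as np

arr1 = np.arange(1, 11)

# Exercise 3: astype returns a converted copy; arr1 itself is unchanged
arr_float = arr1.astype(np.float64)
print(arr_float.dtype)

# Exercise 4: arithmetic operators act element-wise
added = arr1 + 2
subtracted = arr1 - 2
multiplied = arr1 * 2
divided = arr1 / 2  # true division always yields a float array
print(added, subtracted, multiplied, divided, sep="\n")
```

The same operators also work between two arrays of matching shape, which is what the solution above does with `np.random.rand(10)`.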


## Working with XRD data
The file `../Data/exercise/fco.txt` contains a trimmed output file from a crystallographic refinement. If you prefer, you can simply name the variables $A$-$F$; the crystallographic context is not essential for the exercises.

Here are the exercises for working with the SC-XRD data:

1. **Read the data**: Use NumPy to read the data from the file `../Data/exercise/fco.txt` mentioned above. The data contains six columns:
    - $h$ : integer, Miller index
    - $k$ : integer, Miller index
    - $l$ : integer, Miller index
    - $F^2_\text{calc,i}$ : calculated scaled intensity from the model
    - $F^2_\text{obs,i}$ : observed scaled intensity from the measurement
    - $\sigma_i$ : estimated standard deviation of the scaled observed intensity

Note: The data file already contains the squared values, so do not square them again.


In [20]:
h, k, l, fsq_calc, fsq_obs, sigma = np.loadtxt('../Data/exercise/fco.txt', unpack=True, dtype='i8,i8,i8,f8,f8,f8')
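
To see how `unpack=True` interacts with a structured `dtype`, the following sketch reads three made-up reflections from an in-memory string instead of the real file. The sample values are purely illustrative and are not taken from `fco.txt`; the sketch also assumes whitespace-delimited columns and no header line.

```python
import io
import numpy as np

# Hypothetical sample rows: h, k, l, F^2_calc, F^2_obs, sigma
sample = io.StringIO(
    "1 0 0 100.0 98.5 2.0\n"
    "0 2 0 50.0 52.5 1.5\n"
    "2 1 -1 10.0 9.0 0.5\n"
)

# With a structured dtype, unpack=True returns one array per field,
# so the Miller indices stay integers and the intensities stay floats
h, k, l, fsq_calc, fsq_obs, sigma = np.loadtxt(
    sample, unpack=True, dtype="i8,i8,i8,f8,f8,f8"
)

print(h.dtype, fsq_obs.dtype)
print(l)
```

Without the structured `dtype`, `np.loadtxt` would read every column as a float, and the indices would need a separate conversion.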


2. **Calculate the mean I/sigma**: Calculate the mean of the ratio of observed intensity to the estimated standard deviation for the dataset. The formula for calculating the mean is:

   $$\overline{I/\sigma} = \frac{\sum_{i=1}^{n} \frac{F^2_\text{obs,i}}{\sigma_i}}{n}$$

   where $F^2_\text{obs,i}$ is the observed absolute squared structure factor and $\sigma_i$ is the estimated standard deviation for the $i$th data point.


In [21]:
np.nanmean(fsq_obs / sigma)

22.894137052287043
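
The solution uses `np.nanmean` rather than `np.mean`. The sketch below, with made-up numbers, shows the difference: a single NaN in the ratio makes the plain mean NaN, while `np.nanmean` averages the remaining entries. Whether the real data set contains such entries is not stated here, so treat this as an illustration only.

```python
import numpy as np

# Hypothetical values; the last observation is missing (NaN)
fsq_obs = np.array([10.0, 20.0, 30.0, np.nan])
sigma = np.array([1.0, 2.0, 5.0, 1.0])

ratio = fsq_obs / sigma        # element-wise I/sigma: [10, 10, 6, nan]
plain_mean = np.mean(ratio)    # NaN propagates through the plain mean
nan_mean = np.nanmean(ratio)   # averages only the three valid entries

print(plain_mean, nan_mean)
```

Note that `np.nanmean` changes $n$ in the formula to the number of valid entries, which may or may not be what an analysis intends.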


3. **Calculate $R_1$**: Calculate the R1 value for the dataset. R1 is a measure of the agreement between the observed and modelled absolute squared structure factors. The formula for calculating R1 is:

   $$R_1 = \frac{\sum_{i=1}^{n} |F^2_\text{calc,i} - F^2_\text{obs,i}|}{\sum_{i=1}^{n} |F^2_\text{obs,i}|}$$

   where $F^2_\text{obs,i}$ is the observed absolute squared structure factor and $F^2_\text{calc,i}$ is the modelled absolute squared structure factor for the $i$th data point.


In [22]:
np.sum(np.abs(fsq_calc - fsq_obs)) / np.sum(np.abs(fsq_obs))

0.08166633291590505
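
The formula can be checked by hand on a tiny example. The following sketch wraps the expression in a helper (the name `r1` and the sample values are illustrative assumptions, not part of the exercise data):

```python
import numpy as np

def r1(fsq_calc, fsq_obs):
    """Sum of absolute differences divided by the sum of absolute observations."""
    fsq_calc = np.asarray(fsq_calc, dtype=float)
    fsq_obs = np.asarray(fsq_obs, dtype=float)
    return np.sum(np.abs(fsq_calc - fsq_obs)) / np.sum(np.abs(fsq_obs))

# Hypothetical values: absolute differences are 2, 2, 0 and the observations sum to 60
r1_value = r1([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
print(r1_value)
```

A perfect model, where calculated and observed values agree everywhere, gives a value of 0.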


4. **Calculate wR2**: Calculate the weighted R2 value for the dataset. wR2 is another measure of the agreement between the observed and modelled absolute squared structure factors, taking into account the estimated standard deviation. The formula for calculating wR2 is:

   $$wR_2 = \frac{\sum_{i=1}^{n} \left( \frac{F^2_\text{obs,i} - F^2_\text{calc,i}}{\sigma_i} \right)^2}{\sum_{i=1}^{n} \left( \frac{F^2_\text{obs,i}}{\sigma_i} \right)^2}$$

   where $F^2_\text{obs,i}$ is the observed absolute squared structure factor, $F^2_\text{calc,i}$ is the modelled absolute squared structure factor, and $\sigma_i$ is the estimated standard deviation for the $i$th data point.


In [23]:
np.sum(((fsq_calc - fsq_obs) / sigma)**2) / np.sum((fsq_obs / sigma)**2)
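
In the formula above, each term is squared before the sum is taken, so the position of `**2` relative to `np.sum` matters. The sketch below uses made-up numbers to contrast the two placements; the function name `wr2` is only illustrative.

```python
import numpy as np

def wr2(fsq_calc, fsq_obs, sigma):
    """Summed squared weighted residuals over summed squared weighted observations."""
    numerator = np.sum(((fsq_obs - fsq_calc) / sigma) ** 2)
    denominator = np.sum((fsq_obs / sigma) ** 2)
    return numerator / denominator

# Hypothetical values: weighted residuals are -2 and 1
fsq_calc = np.array([12.0, 18.0])
fsq_obs = np.array([10.0, 20.0])
sigma = np.array([1.0, 2.0])

sum_of_squares = wr2(fsq_calc, fsq_obs, sigma)

# Squaring after the sum lets residuals of opposite sign cancel
square_of_sum = (
    np.sum((fsq_obs - fsq_calc) / sigma) ** 2 / np.sum((fsq_obs / sigma) ** 2)
)

print(sum_of_squares, square_of_sum)
```

The two results differ because positive and negative residuals partly cancel when they are summed before squaring, which understates the disagreement between model and data.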