In [None]:
with open('37-w1-notes.txt', 'r') as f:
    content = f.read()
print(content)

-1.085630603300561159e+00
9.973454465835858151e-01
2.829784980519920445e-01
-1.506294713918091999e+00
-5.786002519685363721e-01
1.651436537097151103e+00
-2.426679243393074170e+00
-4.289126288561772582e-01
1.265936258705534057e+00
-8.667404022651017392e-01



In [17]:
import numpy as np

# Load data from the file for verification
loaded_data_from_file = np.loadtxt('37-w1-notes.txt')
print("Loaded data from file:", loaded_data_from_file)

# Regenerate data using the reproducible function for comparison
newly_generated_data = generate_data()
print("Newly generated data:", newly_generated_data)

# Compare the two arrays to verify reproducibility
is_data_identical = np.array_equal(loaded_data_from_file, newly_generated_data)
print("Are loaded data and newly generated data identical?", is_data_identical)

assert is_data_identical, "Verification failed: Loaded data and newly generated data are not identical!"
print("Verification successful: The content of '37-w1-notes.txt' matches the generated data.")

Loaded data from file: [-1.0856306   0.99734545  0.2829785  -1.50629471 -0.57860025  1.65143654
 -2.42667924 -0.42891263  1.26593626 -0.8667404 ]
Newly generated data: [-1.0856306   0.99734545  0.2829785  -1.50629471 -0.57860025  1.65143654
 -2.42667924 -0.42891263  1.26593626 -0.8667404 ]
Are loaded data and newly generated data identical? True
Verification successful: The content of '37-w1-notes.txt' matches the generated data.


## Subtask 1
### Reasons for Changes in Experimental Results Between Runs

Experimental results can often vary between runs, leading to reproducibility challenges. Here are the primary reasons why this might occur:

1.  **Lack of a fixed random seed**: Many simulations use random processes. Without a fixed random seed set at the beginning of an experiment, the sequence of random numbers differs in each run, leading to different results.

2.  **Undocumented or unversioned experimental parameters**: If parameters used in the experiment are not recorded or kept under version control, they might change between runs without being noticed.

3.  **Differences in the computational environment**: The environment in which an experiment is run plays a crucial role. Variations in library versions, operating system, or hardware can change the results.
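
One way to guard against the second and third reasons is to store the parameters and the environment details next to the results. The following is a minimal sketch, not part of the original experiment: the name `run_record` and its layout are assumptions made for illustration, while the seed `123` and sample count `10` are the values used in this notebook.

```python
import json
import platform
import sys

import numpy as np

# Values taken from this notebook's experiment: seed 123, 10 samples.
# The dictionary name and structure are hypothetical.
run_record = {
    "parameters": {"seed": 123, "n_samples": 10},
    "environment": {
        "python": sys.version.split()[0],
        "numpy": np.__version__,
        "platform": platform.platform(),
    },
}

# Serialize so the record can be stored next to the results.
print(json.dumps(run_record, indent=2))
```

Saving this record with each result file makes it possible to tell later whether two runs differed in parameters or environment rather than in the random numbers themselves.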

## Subtask 2
### Baseline



In [None]:
import numpy as np
print("Numpy imported successfully.")

Numpy imported successfully.


In [None]:
print("--- Non-reproducible example without seed ---")
print("Run 1:", np.random.randn(10))
print("Run 2:", np.random.randn(10))
print("As you can see, the numbers are different each time.")

--- Non-reproducible example without seed ---
Run 1: [-1.26874199 -0.43534727 -0.08591081 -0.49485638 -0.58414494 -0.60913843
  1.18141538 -1.47171162  1.82949492 -0.21735203]
Run 2: [-0.45825564  0.08667603 -0.9422228   0.59702217  1.271634    0.07357013
 -0.68620499 -1.14719801  1.09694685 -1.33761819]
As you can see, the numbers are different each time.


In [None]:
print("--- Reproducible example with seed ---")
np.random.seed(123)
print("Run 1 (after seeding):", np.random.randn(10))

np.random.seed(123)
print("Run 2 (after re-seeding):", np.random.randn(10))
print("As you can see, the numbers are identical when the same seed is set.")

--- Reproducible example with seed ---
Run 1 (after seeding): [-1.0856306   0.99734545  0.2829785  -1.50629471 -0.57860025  1.65143654
 -2.42667924 -0.42891263  1.26593626 -0.8667404 ]
Run 2 (after re-seeding): [-1.0856306   0.99734545  0.2829785  -1.50629471 -0.57860025  1.65143654
 -2.42667924 -0.42891263  1.26593626 -0.8667404 ]
As you can see, the numbers are identical when the same seed is set.


In [None]:
def generate_data():
    np.random.seed(123)
    return np.random.randn(10)

print("The 'generate_data' function has been defined.")

The 'generate_data' function has been defined.


In [None]:
print("--- Reproducible output from function calls ---")
print("Function Call 1:", generate_data())
print("Function Call 2:", generate_data())
print("As expected, the function consistently returns the same data.")

--- Reproducible output from function calls ---
Function Call 1: [-1.0856306   0.99734545  0.2829785  -1.50629471 -0.57860025  1.65143654
 -2.42667924 -0.42891263  1.26593626 -0.8667404 ]
Function Call 2: [-1.0856306   0.99734545  0.2829785  -1.50629471 -0.57860025  1.65143654
 -2.42667924 -0.42891263  1.26593626 -0.8667404 ]
As expected, the function consistently returns the same data.


## Subtask 3
### Assumptions in the Reproducibility Experiment

In the previous demonstration of reproducibility, several assumptions were made:

1.  **Fixed Random Seed (`seed=123`)**: A specific integer value, `123`, was chosen and consistently applied as the random seed using `np.random.seed(123)`. This choice is arbitrary but crucial for ensuring that the sequence of pseudo-random numbers generated is identical across different runs, thereby establishing reproducibility.

2.  **Generation of 10 Standard Normal Samples**: The `np.random.randn(10)` function was used to generate an array of 10 samples from a standard normal (Gaussian) distribution (mean = 0, variance = 1). The number `10` was chosen for simplicity, while `randn()` provides a common source of random numbers in scientific computing.

3.  **Use of `np.random.seed()` for Determinism**: The primary reason for explicitly calling `np.random.seed()` before each generation of random numbers in the reproducible examples was to force the random number generator into a known state. This guarantees that subsequent calls to random number generation functions (`np.random.randn()` in this case) will produce the exact same sequence of numbers, thereby demonstrating determinism.

4.  **Encapsulation in `generate_data()` for Modularity**: The data generation logic was encapsulated within a function `generate_data()`. This design choice makes the code cleaner and easier to manage. More importantly, it demonstrates how internal seeding within a function can ensure that every call to that function yields the exact same output.
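
A side effect of the approach above is that `np.random.seed(123)` resets NumPy's global random state on every call, which also affects any other code drawing from `np.random`. As an alternative sketch (not the method used in this notebook), a local generator from `np.random.default_rng` keeps the seeding confined to the function. The name `generate_data_local` and its parameters are hypothetical. The values it returns will differ from the `np.random.seed(123)` output shown in this notebook, because `default_rng` uses a different underlying generator.

```python
import numpy as np


def generate_data_local(seed=123, n_samples=10):
    """Variant of generate_data() that uses a local Generator (hypothetical name)."""
    # An independent generator: NumPy's global random state is left untouched.
    rng = np.random.default_rng(seed)
    return rng.standard_normal(n_samples)


first = generate_data_local()
second = generate_data_local()
print("Identical across calls:", np.array_equal(first, second))
```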

In [16]:
print("--- Generating and Saving Data ---")
generated_data = generate_data()
np.savetxt('37-w1-notes.txt', generated_data)

print(f"Generated data:\n{generated_data}")
print("Data saved to '37-w1-notes.txt'.")

--- Generating and Saving Data ---
Generated data:
[-1.0856306   0.99734545  0.2829785  -1.50629471 -0.57860025  1.65143654
 -2.42667924 -0.42891263  1.26593626 -0.8667404 ]
Data saved to '37-w1-notes.txt'.


### How to Verify Reproducibility

To verify the reproducibility of the experiment, follow these steps:

1.  **Rerun the entire notebook**: Close and reopen this notebook, then execute all cells from top to bottom. This simulates a fresh run of the experiment.

2.  **Load the saved data**: After rerunning, execute the following code in a new cell to load the data previously saved to `37-w1-notes.txt`:
    ```python
    loaded_array = np.loadtxt('37-w1-notes.txt')
    print("Loaded data from file:", loaded_array)
    ```

3.  **Generate new data**: In the same (or another new) cell, call the `generate_data()` function again to get a fresh set of reproducible data:
    ```python
    newly_generated_array = generate_data()
    print("Newly generated data:", newly_generated_array)
    ```

4.  **Compare the outputs**: To confirm reproducibility, compare the `loaded_array` with the `newly_generated_array`. They should be identical if the experiment is reproducible, as both rely on the same fixed seed within the `generate_data()` function.
    ```python
    comparison_result = np.array_equal(loaded_array, newly_generated_array)
    print("Are loaded data and newly generated data identical?", comparison_result)
    assert comparison_result, "Reproducibility check failed: data arrays are not identical!"
    print("Reproducibility successfully verified!")
    ```

If the `comparison_result` is `True`, it confirms that the data generated using the `generate_data()` function, even after saving and reloading, remains consistent across different runs, demonstrating successful reproducibility.
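
The steps above can also be bundled into a single check. This is a minimal sketch under stated assumptions: `check_round_trip` and the temporary file name are hypothetical, and `generate_data()` is repeated from earlier in the notebook only so the block runs on its own. It writes to a temporary directory instead of `37-w1-notes.txt` so the saved file is not overwritten.

```python
import os
import tempfile

import numpy as np


def generate_data():
    # Same definition as earlier in the notebook, repeated so the sketch runs alone.
    np.random.seed(123)
    return np.random.randn(10)


def check_round_trip(generate_fn):
    """Save generated data, reload it, and compare with a fresh call (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "round_trip.txt")  # hypothetical file name
        np.savetxt(path, generate_fn())
        loaded = np.loadtxt(path)
    # Exact comparison, matching the np.array_equal check used in the steps above.
    return np.array_equal(loaded, generate_fn())


print("Round trip identical:", check_round_trip(generate_data))
```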