
Conversation

@jan-janssen
Member

@jan-janssen jan-janssen commented Mar 28, 2025

Summary by CodeRabbit

  • New Features

    • Introduced a comprehensive environment configuration for streamlined dependency management.
    • Integrated an automated workflow that executes interactive notebooks showcasing computational materials science simulations with visualized energy evaluations.
  • Documentation

    • Enhanced the content structure with a new application section, providing guidance on calculating material properties and improving overall organization.
    • Added direct links to documentation for GPAW and Quantum Espresso applications.

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@coderabbitai
Contributor

coderabbitai bot commented Mar 28, 2025

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (1)
  • notebooks/5-2-quantum-espresso.ipynb

You can disable this status message by setting reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This pull request introduces a new Conda environment configuration file with predefined channels and dependencies. It adds a new GitHub Actions job to run Jupyter notebook integrations based on the updated environment, thereby extending the CI/CD pipeline. The documentation is updated with a new chapter and subsections for application examples. Furthermore, two new Jupyter notebooks are added to perform computational materials science analyses using GPAW and Quantum Espresso, featuring functions for energy evaluations, structure generation, task submission via Flux executors, and subsequent plotting.
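For orientation, the pattern shared by the new notebooks looks roughly like this (a condensed sketch assembled from the code excerpts quoted in the reviews below, using the GPAW variant; it is not a verbatim copy of any single notebook):

from ase.build import bulk
from atomistics.workflows.evcurve.helper import (
    analyse_structures_helper as evcurve_analyse_structures,
    generate_structures_helper as evcurve_generate_structures,
)
from executorlib import FluxClusterExecutor


def evaluate_with_gpaw(task_dict, kpts, encut):
    # Single-point DFT energy for one strained structure.
    from gpaw import GPAW, PW

    structure = task_dict["calc_energy"].copy()
    structure.calc = GPAW(xc="PBE", mode=PW(encut), kpts=kpts)
    return structure.get_potential_energy()


# Generate strained Al structures spanning a 5% volume range.
structure_dict = evcurve_generate_structures(
    structure=bulk("Al", a=4.05, cubic=True),
    num_points=7,
    vol_range=0.05,
    axes=("x", "y", "z"),
)
task_loop_dict = {k: {"calc_energy": v} for k, v in structure_dict.items()}

# Submit one energy evaluation per structure via Flux, then collect results.
with FluxClusterExecutor() as exe:
    future_dict = {
        k: exe.submit(
            evaluate_with_gpaw,
            task_dict=v,
            kpts=(3, 3, 3),
            encut=300,
            resource_dict={"cores": 2},
        )
        for k, v in task_loop_dict.items()
    }
    result_dict = {k: f.result() for k, f in future_dict.items()}

# Fit the energy-volume curve to extract the bulk modulus.
fit_dict = evcurve_analyse_structures(
    output_dict={"energy": result_dict},
    structure_dict=structure_dict,
    fit_type="polynomial",
    fit_order=3,
)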

Changes

File(s) and Change Summary

  • .ci_support/environment-integration.yml, .github/workflows/pipeline.yml: Introduces a new Conda environment configuration file with specified channels and dependencies; adds a new CI job notebooks_integration to set up the environment, install packages, and execute notebook tests.
  • docs/_toc.yml, docs/application.md: Adds a new documentation chapter (application.md) with subsections for 5-1-gpaw.ipynb and 5-2-quantum-espresso.ipynb; includes a brief description for bulk modulus calculation.
  • notebooks/5-1-gpaw.ipynb, notebooks/5-2-quantum-espresso.ipynb: Introduces two new Jupyter notebooks for computational materials science; each notebook defines functions to evaluate energies using GPAW and Quantum Espresso, manage task execution with Flux executors, and visualize results.

Sequence Diagram(s)

sequenceDiagram
    participant CI as "CI Job"
    participant Repo as "Repository"
    participant Env as "Mambaforge Setup"
    participant Exec as "Notebook Executor"
    participant Flux as "Flux Cluster"
    participant Res as "Results"

    CI->>Repo: Checkout code & merge notebook config
    CI->>Env: Setup Mambaforge (Python 3.12) & install dependencies
    Env-->>CI: Environment ready
    CI->>Exec: Initiate notebooks for GPAW & QE tests
    Exec->>Flux: Submit tasks via FluxClusterExecutor
    Flux-->>Exec: Process tasks and return outputs
    Exec->>CI: Report notebook test results
    CI->>Res: Finalize and log job status

Possibly related PRs

  • Execute notebooks individually #617: Both PRs enhance how Jupyter notebooks are executed in the CI/CD pipeline; the main PR adds a dedicated CI job for running the integration notebooks, while the retrieved PR executes notebooks individually through a different method.

Poem

In a burrow of code I hop with delight,
New configs and notebooks shining bright.
CI paths like winding tunnels, so neat,
Running tests and tasks on nimble feet.
My digital carrots dance in the light!


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@codecov

codecov bot commented Mar 28, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.44%. Comparing base (c4ce1e1) to head (63733e7).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #619   +/-   ##
=======================================
  Coverage   96.44%   96.44%           
=======================================
  Files          28       28           
  Lines        1265     1265           
=======================================
  Hits         1220     1220           
  Misses         45       45           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (13)
.ci_support/environment-integration.yml (1)

1-21: A well-structured environment configuration for integration testing.

This new Conda environment file correctly defines all the necessary dependencies for running the integration test notebooks. The use of specific versions for most packages (like cloudpickle =3.1.1, flux-core =0.59.0, etc.) ensures reproducibility across different environments.

Add a newline at the end of the file to comply with YAML best practices:

 - atomistics =0.2.4
 - qe =7.2
 - gpaw =24.6.0
+
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 21-21: no new line character at the end of file

(new-line-at-end-of-file)

notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (2)

234-240: Invalid escape sequence in string literal.

The warning about invalid escape sequence \A indicates an issue with the LaTeX formatting in the axis label.

Fix the invalid escape sequence by using a raw string:

-  plt.xlabel("Volume [$\\AA^3$]")
+  plt.xlabel(r"Volume [$\AA^3$]")

Using a raw string (with the r prefix) avoids the need to escape backslashes in LaTeX expressions.
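The equivalence is easy to verify with a two-line check (plain Python, independent of matplotlib):

# "\\AA" (escaped) and r"\AA" (raw) produce the identical string, so the
# change is purely cosmetic; the raw form simply avoids the invalid "\A"
# escape warning that appears when a single backslash is used by mistake.
assert "Volume [$\\AA^3$]" == r"Volume [$\AA^3$]"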


187-209: Atomic simulation tasks run sequentially despite using a job executor.

The code correctly creates directories for each strain calculation and submits jobs to the Flux executor. However, it then immediately collects results synchronously in the same loop, which does not fully leverage the executor's parallel execution capabilities.

Consider separating job submission from result collection to better leverage parallel execution:

    future_dict = {}
    with FluxJobExecutor(flux_log_files=True, flux_executor_nesting=True) as exe:
        for k, v in task_loop_dict.items():
            os.makedirs(os.path.abspath(("strain_%0.2f" % k).replace(".", "_")), exist_ok=True)
            future_dict[k] = exe.submit(
                evaluate_with_quantum_espresso, 
                task_dict=v, 
                pseudopotentials=pseudopotentials, 
                kpts=(3, 3, 3), 
                resource_dict={
                    "cores": 1, 
                    "threads_per_core": 2, 
                    "cwd": os.path.abspath(("strain_%0.2f" % k).replace(".", "_")),
                },
            )
        sleep(1)
        pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))
+       # Wait for all jobs to complete and collect results
        result_dict = {
            k: f.result() 
            for k, f in tqdm(future_dict.items())
        }
        sleep(1)
        pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))

This doesn't change functionality but clarifies that you're submitting all jobs first, then collecting results.

notebooks/5-2-flux-job-gpaw-bulk-modulus.ipynb (3)

64-73: Well-defined GPAW evaluation function

The evaluate_with_gpaw function properly configures the GPAW calculator with appropriate parameters. Consider adding docstrings to improve documentation.

 def evaluate_with_gpaw(task_dict, kpts, encut):
+    """
+    Evaluate energy of a structure using GPAW with plane-wave mode.
+    
+    Parameters:
+        task_dict (dict): Dictionary containing structure in 'calc_energy'
+        kpts (tuple): k-point sampling as a 3-tuple
+        encut (float): Energy cutoff in eV
+        
+    Returns:
+        float: Potential energy of the structure
+    """
     from gpaw import GPAW, PW

     structure = task_dict["calc_energy"].copy()
     structure.calc = GPAW(
         xc="PBE",
         mode=PW(encut),
         kpts=kpts,
     )
     return structure.get_potential_energy()

170-188: Well-structured job execution loop

The code effectively uses FluxJobExecutor to submit parallel jobs, monitors their status, and collects results. The tqdm progress bar provides good user feedback during execution. Consider adding basic error handling for job failures.

 future_dict = {}
 with FluxJobExecutor() as exe:
     for k, v in task_loop_dict.items():
         future_dict[k] = exe.submit(
             evaluate_with_gpaw, 
             task_dict=v, 
             kpts=(3, 3, 3), 
             encut=300,
             resource_dict={"cores": 2},
         )
     sleep(1)
     pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))
-    result_dict = {
-        k: f.result()[-1] 
-        for k, f in tqdm(future_dict.items())
-    }
+    result_dict = {}
+    for k, f in tqdm(future_dict.items()):
+        try:
+            result_dict[k] = f.result()[-1]
+        except Exception as e:
+            print(f"Error in job {k}: {e}")
     sleep(1)
     pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))

243-247: Consider enhancing the visualization

While the current plot is functional, consider adding data points to show the calculated values alongside the fitted curve, and potentially adding a title to make the plot more informative.

+plt.figure(figsize=(8, 6))
+plt.scatter(fit_dict["volume"], fit_dict["energy"], color='red', label='Calculated points')
 plt.plot(fit_dict["volume"], fit_dict["energy"], label="$B_0$= %0.2f GPa" % fit_dict["bulkmodul_eq"])
 plt.xlabel("Volume [$\\mathrm{\\AA}^3$]")
 plt.ylabel("Energy [eV]")
+plt.title("Al Bulk Modulus Calculation with GPAW")
 plt.legend()
+plt.grid(True, linestyle='--', alpha=0.7)
notebooks/5-1-flux-cluster-gpaw-bulk-modulus.ipynb (3)

64-73: The evaluation function is well structured but lacks docstring.

The evaluate_with_gpaw function is appropriately defined with clear parameters and a logical implementation. However, adding a docstring would improve maintainability and help other developers understand its purpose, parameters, and return value.

Consider adding a detailed docstring:

def evaluate_with_gpaw(task_dict, kpts, encut):
+    """
+    Evaluate the potential energy of a structure using GPAW.
+    
+    Parameters:
+        task_dict (dict): Dictionary containing the structure to evaluate
+        kpts (tuple): k-points mesh (e.g., (3, 3, 3))
+        encut (float): Plane-wave energy cutoff in eV
+        
+    Returns:
+        float: Potential energy of the structure in eV
+    """
    from gpaw import GPAW, PW

    structure = task_dict["calc_energy"].copy()
    structure.calc = GPAW(
        xc="PBE",
        mode=PW(encut),
        kpts=kpts,
    )
    return structure.get_potential_energy()

38-44: Consider documenting the purpose of key parameters in structure generation.

The structure generation parameters are functionally correct but lack context about their significance for this particular calculation.

Add comments to explain the importance of these parameters:

structure_dict = evcurve_generate_structures(
    structure=bulk("Al", a=4.05, cubic=True),  # Aluminum with cubic structure
+    # Using 7 points gives sufficient data for accurate equation of state fitting
    num_points=7,
+    # 5% volume range provides appropriate sampling around equilibrium volume
    vol_range=0.05,
+    # Isotropic scaling of all axes preserves the crystal symmetry
    axes=("x", "y", "z"),
)

173-175: Consider adding a comment about the computational parameters.

The k-points grid and energy cutoff parameters are critical for accuracy and performance in GPAW calculations, but their values are provided without explanation.

Add comments to explain the choice of parameters:

future_dict[k] = exe.submit(
    evaluate_with_gpaw, 
    task_dict=v, 
+    # 3x3x3 k-points grid provides good accuracy/performance balance for Al
    kpts=(3, 3, 3), 
+    # 300 eV cutoff is sufficient for converged results with PW basis
    encut=300,
    resource_dict={"cores": 2},
)
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (4)

74-90: Hardcoded pseudopotential path may cause portability issues.

I noticed the pseudopotential directory is hardcoded to "/home/jan/espresso/pseudo" on line 87, which could fail when running on different environments or CI systems.

Consider making the pseudo_dir configurable through an environment variable or parameter:

def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts):
    from ase.calculators.espresso import Espresso, EspressoProfile
    from atomistics.calculators import evaluate_with_ase
    
+   import os
+   pseudo_dir = os.environ.get("QE_PSEUDO_DIR", "/home/jan/espresso/pseudo")
    
    return evaluate_with_ase(
        task_dict=task_dict,
        ase_calculator=Espresso(
            pseudopotentials=pseudopotentials,
            tstress=True,
            tprnfor=True,
            kpts=kpts,
            profile=EspressoProfile(
                 command="pw.x",
-                pseudo_dir="/home/jan/espresso/pseudo",
+                pseudo_dir=pseudo_dir,
            ),
        ),
    )["energy"]

257-260: Fix the invalid escape sequence warning.

There's a syntax warning about invalid escape sequence \A in your plot xlabel. This should be fixed for clean execution without warnings.

- plt.xlabel("Volume [$\\AA^3$]")
+ plt.xlabel("Volume [$\\mathrm{\\AA}^3$]")

Alternatively, you can use raw strings to avoid escape sequence issues:

plt.xlabel(r"Volume [$\AA^3$]")

211-216: Consider adding error handling for failed calculations.

The current implementation collects results from all futures but doesn't handle potential exceptions if any calculations fail. In a production environment, this could cause the entire workflow to fail.

Add exception handling when collecting results:

    result_dict = {
-       k: f.result() 
+       k: f.result() if not f.exception() else None
        for k, f in tqdm(future_dict.items())
    }
+   # Filter out failed calculations
+   if None in result_dict.values():
+       print("Warning: Some calculations failed. Proceeding with successful ones only.")
+       result_dict = {k: v for k, v in result_dict.items() if v is not None}

1-293: Add documentation comments to explain calculation parameters.

The notebook contains good functional code but lacks explanatory comments for key calculation parameters like k-point mesh (3,3,3) and polynomial fit order (3). These choices impact results and reproducibility.

Add comments to explain key parameters:

  • Near line 190: # Using a 3x3x3 k-point mesh, sufficient for metallic Al ground state convergence
  • Near line 215: # Using 3rd order polynomial for E-V fitting - standard approach for bulk modulus calculation
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a532b9d and a31e50d.

📒 Files selected for processing (6)
  • .ci_support/environment-integration.yml (1 hunks)
  • .github/workflows/pipeline.yml (1 hunks)
  • notebooks/5-1-flux-cluster-gpaw-bulk-modulus.ipynb (1 hunks)
  • notebooks/5-2-flux-job-gpaw-bulk-modulus.ipynb (1 hunks)
  • notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (1 hunks)
  • notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1 hunks)
🧰 Additional context used
🪛 YAMLlint (1.35.1)
.ci_support/environment-integration.yml

[error] 21-21: no new line character at the end of file

(new-line-at-end-of-file)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
🔇 Additional comments (13)
.github/workflows/pipeline.yml (1)

176-210: Well-structured CI job for notebook integration tests.

This new notebooks_integration job correctly sets up the necessary environment and executes the Quantum Espresso and GPAW bulk modulus notebooks. The job is properly integrated with the existing CI pipeline (depends on the black job) and sets appropriate environment variables needed for the tests.

A few observations:

  • Line 203 references 5-1-flux-cluster-qe-bulk-modulus.ipynb, although the notebooks added in this PR follow the 5-4-flux-job-qe-bulk-modulus.ipynb naming pattern
  • Line 202 has the same mismatch with 5-2-flux-job-qe-bulk-modulus.ipynb

The filenames should be consistent between the workflow and the actual repository structure.
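A quick way to guard against this kind of drift would be a small check script along these lines (a hypothetical helper, not part of this PR; paths assume the repository layout listed above):

import pathlib
import re

# Every notebook path referenced in the workflow must exist in the repo.
workflow_text = pathlib.Path(".github/workflows/pipeline.yml").read_text()
referenced = set(re.findall(r"notebooks/[\w.\-]+\.ipynb", workflow_text))
existing = {p.as_posix() for p in pathlib.Path("notebooks").glob("*.ipynb")}
missing = sorted(referenced - existing)
if missing:
    raise SystemExit(f"Workflow references missing notebooks: {missing}")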

notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1)

1-301: Well-structured notebook for Quantum Espresso bulk modulus calculations.

This notebook provides a clear implementation of aluminum bulk modulus calculation using Quantum Espresso through the Flux job executor. It follows a logical flow:

  1. Imports necessary libraries and defines pseudopotentials
  2. Generates structures for energy vs. volume calculations
  3. Defines the evaluation function that interfaces with Quantum Espresso
  4. Uses FluxJobExecutor to submit and manage the calculations
  5. Analyzes results and plots the energy vs. volume curve
notebooks/5-2-flux-job-gpaw-bulk-modulus.ipynb (6)

1-10: Well-organized notebook structure

The notebook has a clear title and follows a logical structure for calculating the bulk modulus using GPAW. This makes it a good integration test for the workflow.


18-29: Comprehensive imports for the bulk modulus calculation

All necessary libraries are properly imported, including ASE for atomic simulations, atomistics workflows, executorlib for job management, and visualization tools. The code organization follows best practices.


39-44: Appropriate structure generation for aluminum

The code correctly uses ASE's bulk function to create an aluminum structure and generates 7 structures with 5% volume variation to sample the energy-volume curve. This is a suitable approach for bulk modulus calculation.


54-54: Clear task dictionary creation

The dictionary comprehension efficiently transforms the structure dictionary into the task format required for calculations.


197-202: Appropriate use of structure analysis

The notebook correctly uses the evcurve_analyse_structures function to fit the energy-volume data and extract the bulk modulus. This approach follows standard practice in computational materials science.


1-257: Excellent integration test for GPAW bulk modulus calculation

Overall, this notebook provides a well-structured and functional integration test for calculating the bulk modulus of aluminum using GPAW with FluxJobExecutor. It demonstrates the full workflow from structure generation to analysis, making it a valuable addition to the test suite.

The notebook would benefit from additional documentation, error handling, and enhanced visualization, but these are minor improvements to an otherwise solid implementation.

notebooks/5-1-flux-cluster-gpaw-bulk-modulus.ipynb (4)

167-184: Well-implemented parallel execution with proper progress tracking.

The FluxClusterExecutor implementation is a good example of parallel task execution with appropriate resource allocation and progress monitoring. The code uses a context manager, submits tasks with specific resource requirements, and properly tracks progress with tqdm.


194-199: The analysis configuration appears appropriate for bulk modulus calculation.

The equation of state analysis using polynomial fitting is well configured for extracting the bulk modulus from the energy-volume data.
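For context, the bulk modulus follows from the curvature of the fitted E(V) polynomial at the equilibrium volume, B0 = V0 · E''(V0). A self-contained numpy illustration of that relation, with synthetic data standing in for the DFT results (all numbers are illustrative assumptions, not values from this PR):

import numpy as np

EV_PER_A3_TO_GPA = 160.21766208  # 1 eV/Å^3 expressed in GPa

# Synthetic quadratic E(V) data around v0; the curvature is chosen so that
# B0 = v0 * E''(v0) = 0.48 eV/Å^3, roughly 77 GPa (the ballpark for Al).
v0_true, e0 = 66.4, -14.9
curvature = 0.48 / v0_true
volumes = np.linspace(0.95 * v0_true, 1.05 * v0_true, 7)
energies = e0 + 0.5 * curvature * (volumes - v0_true) ** 2

# Third-order polynomial fit, matching the notebook's fit_type/fit_order.
poly = np.poly1d(np.polyfit(volumes, energies, 3))

# Locate the equilibrium volume as the minimum of the fitted curve.
grid = np.linspace(volumes.min(), volumes.max(), 10001)
v0 = grid[np.argmin(poly(grid))]

bulk_modulus = v0 * poly.deriv(2)(v0) * EV_PER_A3_TO_GPA
print(f"B0 ≈ {bulk_modulus:.1f} GPa")  # prints ≈ 76.9 GPa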


18-29: Well-organized imports with appropriate dependency selection.

The imports are properly organized with standard libraries first, followed by specialized scientific libraries. The specific imports rather than wildcard imports follow best practices.


240-244: The plotting code is concise and effective.

The plot effectively visualizes the energy-volume relationship and displays the calculated bulk modulus in the legend.

Once you fix the escape sequence warning, the plot will be complete.

notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (1)

184-201: LGTM! Effective use of parallel execution pattern.

The FluxClusterExecutor implementation efficiently distributes the computation tasks across available resources. Using a context manager (with) ensures proper resource cleanup. The progress bar with tqdm provides good user feedback for long-running calculations.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
.github/workflows/pipeline.yml (2)

179-184: Environment Merge Step Verification
The "Merge Notebook environment" step creates a minimal .condarc file using an echo command. Ensure that this configuration aligns with all integration requirements and that the referenced environment file (.ci_support/environment-integration.yml) is correctly set up with all necessary channels and dependencies.


196-203: Notebook Execution Commands and Timeout Configuration
The "Notebooks" step executes four notebook tests via flux start papermill with a 20-minute timeout. Ensure that both flux and papermill are reliably available in the integration environment, and consider capturing log outputs to facilitate debugging in case any of the notebooks fail during execution.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6138780 and d046292.

📒 Files selected for processing (1)
  • .github/workflows/pipeline.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: notebooks_integration
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
🔇 Additional comments (4)
.github/workflows/pipeline.yml (4)

176-178: Job Initialization: notebooks_integration Added
The new integration test job is defined to depend solely on the black job. Verify if additional dependencies (e.g., setup or prior notebook test outcomes) are needed to ensure reliable execution of the integration tests.


185-190: Mambaforge Setup with Python 3.12
The job uses Python 3.12 in this integration test, which differs from other jobs that may run on 3.13. Confirm that Python 3.12 is the intended version for these tests, or update accordingly for consistency if needed.


191-195: Installation Step Consistency
The installation commands—installing versioneer[toml]==0.29 and the package without dependencies or build isolation—mirror those in other jobs and seem appropriate for ensuring a clean environment.


204-210: Environment Variables Configuration Check
The environment variables (e.g., ASE_ESPRESSO_COMMAND, ESPRESSO_PSEUDO, and the OpenMPI-related settings) are crucial for the correct execution of the notebooks. Please verify that ESPRESSO_PSEUDO correctly points to the intended integration test directory and consider adding inline comments or documentation for these variables to aid future troubleshooting.
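For reproducing the job locally, the same variables can also be exported from Python before the calculator is constructed (a sketch; ESPRESSO_PSEUDO and ASE_ESPRESSO_COMMAND are the variable names mentioned above, while the path and command string are illustrative assumptions to adjust to your setup):

import os

# Mirror the CI environment variables for a local run.
os.environ.setdefault("ESPRESSO_PSEUDO", os.path.expanduser("~/espresso/pseudo"))
os.environ.setdefault("ASE_ESPRESSO_COMMAND", "pw.x -in PREFIX.pwi > PREFIX.pwo")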

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/pipeline.yml (1)

180-184: Checkout and Environment Merge Step Clarity
The step uses actions/checkout@v4 followed by a "Merge Notebook environment" step that only writes a simple .condarc file with the conda-forge channel. Although this meets the minimal requirement for Conda configuration, the step name may be slightly misleading if no merging of multiple environment files is taking place. Consider renaming it (e.g., "Configure Conda Channels") or expanding it to perform an actual merge if intended.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d046292 and 7db29e7.

📒 Files selected for processing (1)
  • .github/workflows/pipeline.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: unittest_win
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: notebooks_integration
🔇 Additional comments (4)
.github/workflows/pipeline.yml (4)

176-179: Job Definition: Notebooks Integration Job Added
The new notebooks_integration job is properly defined with a dependency on the black job and set to run on ubuntu-latest. Ensure that depending solely on black is sufficient for the integration context, or if additional prerequisites should be chained.


185-190: Setup Mambaforge with Integration Environment
The "Setup Mambaforge" step is correctly configured using Python 3.12 and the latest miniforge version. The step specifies the .ci_support/environment-integration.yml file, ensuring the job uses the correct dependencies for notebook execution. Please verify that the integration environment file includes all necessary packages for running your GPAW and Quantum Espresso notebooks.


191-196: Installation Step Consistency
The installation step installs versioneer[toml]==0.29 and the current package without dependency or build isolation. This mirrors similar steps in other pipeline parts and should work well for ensuring the package is ready for notebook execution.


197-209: Notebooks Execution and Environment Variables
The "Notebooks" step uses flux start papermill to run four notebook files with a 20-minute timeout. The commands are well structured, and the designated environment variables (such as ESPRESSO_PSEUDO, OMPI_MCA_plm, etc.) appear tailored for the integration tests. Confirm that these environment variable settings align with the requirements of your notebooks, particularly for handling simulation inputs and MPI configuration.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1)

85-87: 🛠️ Refactor suggestion

Fix the pseudo_dir path to be more CI-friendly.

The path to the pseudopotentials directory is still using a relative path that may not work reliably in CI environments. This was mentioned in a previous review comment, but the updated path "tests/integration" could still cause issues.

            profile=EspressoProfile(
                 command="flux run pw.x",
-                pseudo_dir="tests/integration",
+                pseudo_dir=os.environ.get("ESPRESSO_PSEUDO", "tests/integration"),
            ),

Using an environment variable with a fallback value makes the code more portable across different environments.

🧹 Nitpick comments (6)
notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (2)

187-201: Potential issue with relative paths in directory creation.

The directory creation for each strain calculation uses relative paths, which might lead to unexpected behavior depending on where the notebook is executed from.

-        os.makedirs(os.path.abspath(("strain_%0.2f" % k).replace(".", "_")), exist_ok=True)
+        strain_dir = os.path.abspath(("strain_%0.2f" % k).replace(".", "_"))
+        os.makedirs(strain_dir, exist_ok=True)
         future_dict[k] = exe.submit(
             evaluate_with_quantum_espresso, 
             task_dict=v, 
             pseudopotentials=pseudopotentials, 
             kpts=(3, 3, 3), 
             resource_dict={
                 "cores": 1, 
                 "threads_per_core": 2, 
-                "cwd": os.path.abspath((\"strain_%0.2f\" % k).replace(\".\", \"_\")),
+                "cwd": strain_dir,
             },
         )

This refactoring reduces code duplication and makes the directory management more explicit.


1-302: Code logic is sound but could benefit from error handling.

The overall implementation for calculating the bulk modulus of aluminum using Quantum Espresso with Flux job executor works well. The notebook successfully:

  1. Defines pseudopotentials
  2. Generates structures with varying volumes
  3. Sets up and executes the calculations
  4. Analyzes results to obtain the bulk modulus
  5. Visualizes the energy-volume curve

However, the notebook lacks error handling for potential failures in job execution or file I/O operations. Consider adding try-except blocks around critical operations.

notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (4)

39-40: Consider adding pseudopotential source documentation.

The pseudopotential file specification is provided, but it would be helpful to include a comment or documentation reference about where this pseudopotential can be obtained and its properties.


74-90: Path hardcoding in pseudo_dir may cause portability issues.

The pseudo_dir parameter is hardcoded to "tests/integration", which might cause issues if the notebook is run from a different directory. Consider using a relative path or environment variable.

-            profile=EspressoProfile(
-                 command="pw.x",
-                 pseudo_dir="tests/integration",
-            ),
+            profile=EspressoProfile(
+                 command="pw.x",
+                 pseudo_dir=os.path.join(os.path.dirname(os.path.abspath(".")), "tests/integration"),
+            ),

257-260: Plot formatting could be improved with additional customization.

While the plot effectively displays the essential information, consider enhancing it with grid lines, improved font sizes, and a title to make it more publication-ready.

 plt.plot(fit_dict["volume"], fit_dict["energy"], label="$B_0$= %0.2f GPa" % fit_dict["bulkmodul_eq"])
 plt.xlabel(r"Volume [$\AA^3$]")
 plt.ylabel("Energy [eV]")
+plt.title("Al Bulk Modulus Calculation with Quantum Espresso")
+plt.grid(True, linestyle='--', alpha=0.7)
+plt.tight_layout()
 plt.legend()

1-293: Overall a well-structured and functional integration test notebook.

This notebook effectively demonstrates the workflow for calculating aluminum bulk modulus using Quantum Espresso with Flux cluster execution. It covers the essential steps: structure generation, task definition, parallel execution, result collection, analysis, and visualization. The code follows a logical flow that would be suitable for integration testing.

For integration testing purposes, consider adding assertion statements to validate the calculated bulk modulus against expected values within acceptable tolerances. This would make the notebook more effective as an automated test.
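For example, a tolerance assertion at the end of the notebook could look like this (a sketch; the reference value of roughly 76 GPa for aluminum and the tolerance window are illustrative assumptions, not values from this PR):

# Validate the fitted bulk modulus against a literature reference for Al.
expected_b0 = 76.0   # GPa, approximate reference value for aluminum
tolerance = 10.0     # GPa, generous window for coarse k-point settings
b0 = fit_dict["bulkmodul_eq"]
assert abs(b0 - expected_b0) < tolerance, (
    f"Bulk modulus {b0:.2f} GPa deviates from {expected_b0} GPa "
    f"by more than {tolerance} GPa"
)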

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7db29e7 and b8daa95.

📒 Files selected for processing (2)
  • notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (1 hunks)
  • notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: unittest_mpich (macos-latest, 3.13)
  • GitHub Check: notebooks_integration
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: unittest_win
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
🔇 Additional comments (3)
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (3)

1-31: Well-structured imports and introduction.

The notebook begins with a clear markdown title and imports all necessary libraries for atomic simulations, subprocess execution, and visualization components. The imports are comprehensive and well-organized.


49-54: Well-defined structure generation parameters.

The structure generation is clearly parameterized with appropriate values for aluminum, with a sensible number of points and volume range for accurate bulk modulus calculation.


211-216: Good use of analysis helper function with appropriate fit parameters.

The analysis using evcurve_analyse_structures with polynomial fit of order 3 is appropriate for bulk modulus calculations. The function parameters are well chosen.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (4)
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (4)

74-90: Well-structured evaluation function for Quantum Espresso.

The function is appropriately modular and encapsulates the Quantum Espresso calculation setup. Consider adding docstrings to document the expected format of inputs and outputs.

Consider adding a docstring to explain the parameters and return value:

def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts):
+    """
+    Evaluate atomic structures using Quantum Espresso.
+    
+    Parameters:
+        task_dict (dict): Dictionary containing atomic structures to evaluate
+        pseudopotentials (dict): Dictionary mapping elements to pseudopotential files
+        kpts (tuple): k-point mesh for the calculation
+        
+    Returns:
+        dict: Dictionary of calculated energies
+    """
    from ase.calculators.espresso import Espresso, EspressoProfile
    from atomistics.calculators import evaluate_with_ase

113-113: Resource monitoring is properly implemented.

The use of subprocess.check_output to monitor Flux resources and jobs is appropriate, but consider adding error handling for these calls.

Add try-except blocks to handle potential errors in the subprocess calls:

-pprint.pp(subprocess.check_output(["flux", "resource", "list"], universal_newlines=True).split("\n"))
+try:
+    resource_output = subprocess.check_output(["flux", "resource", "list"], universal_newlines=True)
+    pprint.pp(resource_output.split("\n"))
+except subprocess.CalledProcessError as e:
+    print(f"Error checking resources: {e}")

Also applies to: 131-131


257-260: Well-structured visualization code, but needs escape sequence fix.

The plotting code is concise and presents the key results effectively. Just fix the escape sequence as noted above.

Additionally, consider adding axis limits and improving the figure size for better visualization:

+plt.figure(figsize=(8, 6))
 plt.plot(fit_dict["volume"], fit_dict["energy"], label="$B_0$= %0.2f GPa" % fit_dict["bulkmodul_eq"])
-plt.xlabel("Volume [$\\AA^3$]")
+plt.xlabel(r"Volume [$\AA^3$]")
 plt.ylabel("Energy [eV]")
+plt.tight_layout()
 plt.legend()

211-216: Good use of standard fitting approach.

The use of polynomial fitting to extract the bulk modulus is a standard approach. Consider adding some error metrics or visualization of the fit quality.

Add code to visualize and evaluate the fit quality:

# Add after the fitting code
import numpy as np

residuals = fit_dict["residuals"] if "residuals" in fit_dict else []
if residuals:
    print(f"Fit quality - Root Mean Square Error: {np.sqrt(np.mean(np.array(residuals)**2)):.6f} eV")
    
    # Optional: Plot residuals
    plt.figure(figsize=(8, 4))
    plt.scatter(fit_dict["volume"], residuals, label="Fit Residuals")
    plt.axhline(y=0, color='r', linestyle='--')
    plt.xlabel(r"Volume [$\AA^3$]")
    plt.ylabel("Residuals [eV]")
    plt.legend()
    plt.title("Fit Quality")
    plt.tight_layout()
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b8daa95 and 9f23e24.

📒 Files selected for processing (1)
  • notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
  • GitHub Check: unittest_win
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: unittest_mpich (ubuntu-latest, 3.11)
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
  • GitHub Check: unittest_flux_openmpi
  • GitHub Check: unittest_mpich (ubuntu-latest, 3.12)
  • GitHub Check: unittest_mpich (macos-latest, 3.13)
  • GitHub Check: unittest_mpich (ubuntu-latest, 3.13)
  • GitHub Check: unittest_flux_mpich
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: notebooks_integration
  • GitHub Check: unittest_old
  • GitHub Check: notebooks
🔇 Additional comments (3)
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (3)

18-29: The imports look good, comprehensive set of required libraries.

The notebook includes all necessary imports for handling atomic simulations, parallel execution with Flux, and visualization. The organization is clean and follows best practices.


184-201: Consider adding error handling for job submission and monitoring.

The job submission and monitoring code lacks error handling. If a job fails or if the Flux cluster is not available, the notebook will crash without helpful error messages.

 future_dict = {}
 with FluxClusterExecutor() as exe:
-    for k, v in task_loop_dict.items():
-        future_dict[k] = exe.submit(
-            evaluate_with_quantum_espresso, 
-            task_dict=v, 
-            pseudopotentials=pseudopotentials, 
-            kpts=(3, 3, 3), 
-            resource_dict={"cores": 1, "threads_per_core": 2},
-        )
+    try:
+        for k, v in task_loop_dict.items():
+            future_dict[k] = exe.submit(
+                evaluate_with_quantum_espresso, 
+                task_dict=v, 
+                pseudopotentials=pseudopotentials, 
+                kpts=(3, 3, 3), 
+                resource_dict={"cores": 1, "threads_per_core": 2},
+            )
+    except Exception as e:
+        print(f"Error submitting jobs: {e}")
+        raise

228-233: Fix invalid escape sequence in the xlabel.

The SyntaxWarning about an invalid escape sequence '\A' in the xlabel should be fixed. Use a raw string prefix or double the backslash.

- plt.xlabel("Volume [$\\AA^3$]")
+ plt.xlabel(r"Volume [$\AA^3$]")

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (1)

86-88: 🛠️ Refactor suggestion

Path configuration still uses hardcoded directory path.

The pseudopotential directory is still hardcoded to a specific path in the CI environment. This could cause issues when the notebook is run in different environments.

-                 pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
+                 pseudo_dir=os.environ.get("PSEUDO_DIR", "/home/runner/work/executorlib/executorlib/tests/integration"),
🧹 Nitpick comments (4)
.github/workflows/pipeline.yml (1)

400-416: Consider adding the new job to the dependency chain for automation jobs.

The new notebooks_integration job is not included in the dependency list for the autobot and uml jobs. This means these jobs might run before the integration tests complete, potentially allowing merges even if the integration tests fail.

- needs: [unittest_old, unittest_win, unittest_openmpi, unittest_mpich, unittest_flux_openmpi, unittest_flux_mpich, notebooks, benchmark, minimal, pip_check, mypy]
+ needs: [unittest_old, unittest_win, unittest_openmpi, unittest_mpich, unittest_flux_openmpi, unittest_flux_mpich, notebooks, notebooks_integration, benchmark, minimal, pip_check, mypy]
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (3)

74-90: Add a docstring to the evaluate_with_quantum_espresso function.

The function is missing a docstring to describe its purpose, parameters, and return value. This would improve code readability and maintainability.

 def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts):
+    """
+    Evaluate atomic structures using Quantum Espresso calculator.
+    
+    Parameters:
+    -----------
+    task_dict : dict
+        Dictionary containing the structures to evaluate
+    pseudopotentials : dict
+        Dictionary mapping chemical symbols to pseudopotential file names
+    kpts : tuple
+        K-point mesh for the calculation
+        
+    Returns:
+    --------
+    dict
+        Dictionary containing calculated energies
+    """
     from ase.calculators.espresso import Espresso, EspressoProfile
     from atomistics.calculators import evaluate_with_ase
     
     return evaluate_with_ase(

67-91: Implement a more robust resource configuration approach.

The function doesn't handle resource specification in a flexible way. Consider adding parameters to control computation resources.

-def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts):
+def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts, resource_dict=None):
     """
     Evaluate atomic structures using Quantum Espresso calculator.
     
     Parameters:
     -----------
     task_dict : dict
         Dictionary containing the structures to evaluate
     pseudopotentials : dict
         Dictionary mapping chemical symbols to pseudopotential file names
     kpts : tuple
         K-point mesh for the calculation
+    resource_dict : dict, optional
+        Dictionary with computation resource parameters
         
     Returns:
     --------
     dict
         Dictionary containing calculated energies
     """
     from ase.calculators.espresso import Espresso, EspressoProfile
     from atomistics.calculators import evaluate_with_ase
     
+    # Default QE parameters
+    params = {
+        "tstress": True,
+        "tprnfor": True,
+        "kpts": kpts,
+    }
+    
+    # Add additional parameters from resource_dict
+    if resource_dict is not None:
+        if "conv_thr" in resource_dict:
+            params["conv_thr"] = resource_dict["conv_thr"]
+        if "ecutwfc" in resource_dict:
+            params["ecutwfc"] = resource_dict["ecutwfc"]
+    
     return evaluate_with_ase(
         task_dict=task_dict,
         ase_calculator=Espresso(
             pseudopotentials=pseudopotentials,
-            tstress=True,
-            tprnfor=True,
-            kpts=kpts,
+            **params,
             profile=EspressoProfile(
                 command="pw.x",
                 pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
             ),
         ),
     )["energy"]

211-216: Add error handling for the fitting process.

The analysis of structures is not wrapped in a try-except block, which might cause the notebook to crash if the fitting fails.

-fit_dict = evcurve_analyse_structures(
-    output_dict={"energy": result_dict},
-    structure_dict=structure_dict,
-    fit_type="polynomial",
-    fit_order=3,
-)
+try:
+    fit_dict = evcurve_analyse_structures(
+        output_dict={"energy": result_dict},
+        structure_dict=structure_dict,
+        fit_type="polynomial",
+        fit_order=3,
+    )
+except Exception as e:
+    print(f"Error during fitting process: {e}")
+    raise
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f23e24 and 9e60689.

📒 Files selected for processing (3)
  • .github/workflows/pipeline.yml (1 hunks)
  • notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (1 hunks)
  • notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: unittest_win
  • GitHub Check: notebooks_integration
  • GitHub Check: notebooks
🔇 Additional comments (9)
.github/workflows/pipeline.yml (1)

176-204: Well-structured integration test job added to the workflow.

The new notebooks_integration job properly sets up the environment and executes the integration test notebooks for calculating aluminum bulk modulus using GPAW and Quantum Espresso. This aligns with the PR objective to add integration tests for application notebooks.

notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (5)

74-90: The evaluate_with_quantum_espresso function contains a hardcoded path.

The function includes a hardcoded path to the pseudopotentials directory (pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration"), which won't work in environments other than the CI runner.

Use environment variables or a relative path for portability:

            profile=EspressoProfile(
                 command="flux run pw.x",
-                pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
+                pseudo_dir=os.environ.get("ESPRESSO_PSEUDO", "pseudo"),
            ),

This will use the ESPRESSO_PSEUDO environment variable if defined, or fall back to a relative "pseudo" directory.


265-266: Fix the invalid escape sequence warning in matplotlib label.

The current string format is causing a SyntaxWarning for an invalid escape sequence '\A'. Use a raw string with the correct escape sequence for the Angstrom symbol.

-plt.xlabel("Volume [$\\AA^3$]")
+plt.xlabel(r"Volume [$\AA^3$]")

This change will properly render the Angstrom symbol without generating warnings.


49-54: LGTM: Appropriate test structure generation parameters

The structure generation for aluminum using ASE's bulk function with reasonable volume range and number of points is well-configured for testing the bulk modulus calculation workflow.


187-209: Robust job execution using FluxJobExecutor

Good implementation of the FluxJobExecutor with proper resource specification and directory management. The code correctly submits tasks for evaluation and collects results asynchronously with progress tracking.


219-224: Effective results analysis

The notebook correctly uses the atomistics workflow helper functions to analyze the calculation results and fit the energy vs. volume data to determine the bulk modulus.

notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (3)

184-201: Add error handling for job submission and monitoring.

The job submission and monitoring code lacks error handling. If a job fails or if the Flux cluster is not available, the notebook will crash without helpful error messages.

 future_dict = {}
 with FluxClusterExecutor() as exe:
-    for k, v in task_loop_dict.items():
-        future_dict[k] = exe.submit(
-            evaluate_with_quantum_espresso, 
-            task_dict=v, 
-            pseudopotentials=pseudopotentials, 
-            kpts=(3, 3, 3), 
-            resource_dict={"cores": 1, "threads_per_core": 2},
-        )
+    try:
+        for k, v in task_loop_dict.items():
+            future_dict[k] = exe.submit(
+                evaluate_with_quantum_espresso, 
+                task_dict=v, 
+                pseudopotentials=pseudopotentials, 
+                kpts=(3, 3, 3), 
+                resource_dict={"cores": 1, "threads_per_core": 2},
+            )
+    except Exception as e:
+        print(f"Error submitting jobs: {e}")
+        raise

196-199: Add error handling for result collection.

The result collection process should be wrapped in a try-except block to gracefully handle any failures during job execution.

-    result_dict = {
-        k: f.result() 
-        for k, f in tqdm(future_dict.items())
-    }
+    result_dict = {}
+    try:
+        for k, f in tqdm(future_dict.items()):
+            try:
+                result_dict[k] = f.result()
+            except Exception as e:
+                print(f"Error getting result for job {k}: {e}")
+    except Exception as e:
+        print(f"Error collecting results: {e}")
+        if not result_dict:
+            raise  # Only re-raise if we got no results at all

228-233: Fix invalid escape sequence in the xlabel.

The SyntaxWarning about an invalid escape sequence '\A' in the xlabel should be fixed. Use a raw string prefix or double the backslash.

- plt.xlabel("Volume [$\\AA^3$]")
+ plt.xlabel(r"Volume [$\AA^3$]")

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
.github/workflows/pipeline.yml (2)

180-184: Checkout and Environment Merge Step
The steps to check out the repository and create a basic Conda configuration (via echoing the channels) are clear. If additional environment parameters are needed later, consider enhancing this script.


197-204: Notebook Execution Step
The execution phase uses flux start papermill to run the designated notebooks with an extended timeout (20 minutes) suitable for longer-running tasks. Note that the command for the notebook 5-3-flux-cluster-qe-bulk-modulus.ipynb is commented out. If this is intentional (e.g., due to pending issues or deprecation), consider adding an inline comment to explain why it’s disabled; otherwise, removing the dead code could tidy up the configuration.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e60689 and c0f8ff4.

📒 Files selected for processing (1)
  • .github/workflows/pipeline.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: unittest_win
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: unittest_flux_openmpi
  • GitHub Check: notebooks
🔇 Additional comments (3)
.github/workflows/pipeline.yml (3)

176-179: New Integration Job Setup
The new notebooks_integration job is introduced to run integration tests for the notebooks. It correctly declares its dependency on the black job and specifies the appropriate runner (ubuntu-latest).


185-190: Mambaforge Setup Step
The configuration for setting up Mambaforge is well defined—using Python 3.12, the latest miniforge, and referencing the integration-specific environment file (.ci_support/environment-integration.yml). This aligns with the integration testing objective.


191-196: Install Dependencies Step
The installation step, which installs versioneer[toml]==0.29 and the current package (without dependencies or build isolation), is consistent with other parts of the pipeline. This ensures that the environment is ready to execute the notebooks.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/pipeline.yml (1)

176-204: Review of the Newly Added "notebooks_integration" Job

  • Dependencies & Environment Setup:
    The job correctly depends on the black job and uses the standard ubuntu-latest runner. Ensure that the environment file (.ci_support/environment-integration.yml) referenced in line 190 includes all the necessary dependencies (e.g., papermill, flux, and any other notebook-specific packages) required for execution.

  • Merge Notebook Environment Step:
    The step at lines 180–184 creates a simple .condarc that specifies the conda-forge channel. This is consistent with similar steps in other jobs. Just verify that no additional channels or configuration are needed for integration tests.

  • Setup Mambaforge Action:
    The configuration of the conda-incubator/setup-miniconda@v3 action (lines 184–190) uses Python 3.12 and points to the integration environment file. This appears correct; however, double-check that the miniforge-version: latest will consistently pick the intended version in your CI environment.

  • Installation Step:
    The installation commands at lines 192–195 install versioneer and the package under test without dependencies or build isolation. This matches the patterns in other jobs and should be fine as long as the repository’s metadata is up-to-date.

  • Notebook Execution Step:
    The execution block (lines 197–203) runs four notebook commands using flux start papermill. Please verify that:

    • Using flux start papermill for all four notebooks is intentional. (Recall the PR objective mentions two different execution modes; if there is a need to differentiate between cluster and job executor modes, adjust the commands accordingly or add clarifying comments.)
    • The 20-minute timeout (line 198) is adequate for the notebooks’ execution under integration conditions.
  • Overall Clarity & Consistency:
    The newly added job is structured consistently with the existing jobs. Just ensure that any nuances related to notebook execution modes (i.e., cluster versus job executor as described in the PR objectives) are either addressed here or documented elsewhere, to avoid any confusion.
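
As noted in the merge-notebook-environment point above, the generated .condarc is typically no more than the channel selection — a minimal sketch, assuming only conda-forge is needed:

channels:
  - conda-forge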

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0f8ff4 and 74b8710.

📒 Files selected for processing (1)
  • .github/workflows/pipeline.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: unittest_win
  • GitHub Check: unittest_mpich (macos-latest, 3.13)
  • GitHub Check: notebooks
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: notebooks_integration

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (1)

86-88: 🛠️ Refactor suggestion

Update the pseudopotential directory path

The pseudopotential directory is hardcoded to a specific path related to a GitHub runner, which might cause issues in different environments. This was already flagged in a previous review and reportedly fixed in commit 9e60689, but the path should be made configurable or relative for better portability.

-                 pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
+                 pseudo_dir=os.path.join(os.getcwd(), "..", "tests", "integration"),
🧹 Nitpick comments (4)
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (4)

18-29: Add documentation for the imports

The import section lacks documentation explaining the purpose of each imported library. Adding comments would improve maintainability.

Consider adding documentation comments for each import group:

 import os
 import subprocess
+# ASE for atomic structure manipulation
 from ase.build import bulk
+# Atomistics helpers for energy-volume curve analysis
 from atomistics.workflows.evcurve.helper import (
     analyse_structures_helper as evcurve_analyse_structures,
     generate_structures_helper as evcurve_generate_structures,
 )
+# Parallel execution via Flux
 from executorlib import FluxClusterExecutor
+# Visualization and utilities
 import matplotlib.pyplot as plt
 import pprint
 from tqdm import tqdm
 from time import sleep

39-39: Consider adding a comment about the pseudopotential definition

The pseudopotential definition lacks an explanation of what it is and where it comes from. This would help with reproducibility.

Add a comment explaining the pseudopotential:

+# PBE pseudopotential for Aluminum from the pslibrary
 pseudopotentials = {"Al": "Al.pbe-n-kjpaw_psl.1.0.0.UPF"}

257-260: Consider adding more informative plot labels and title

The visualization could be improved with a more descriptive title and axis labels that include units.

Enhance the plot with more informative labels:

 plt.plot(fit_dict["volume"], fit_dict["energy"], label="$B_0$= %0.2f GPa" % fit_dict["bulkmodul_eq"])
 plt.xlabel(r"Volume [$\AA^3$]")
 plt.ylabel("Energy [eV]")
+plt.title("Energy vs. Volume Curve for Aluminum")
+plt.grid(True, linestyle='--', alpha=0.7)
 plt.legend()

74-90: Consider parameterizing calculation settings

The function hardcodes several calculation parameters that could be made configurable through additional function parameters.

Make the calculation settings more configurable:

-def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts):
+def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts, pseudo_dir=None, 
+                                 tstress=True, tprnfor=True, command="flux run pw.x"):
     from ase.calculators.espresso import Espresso, EspressoProfile
     from atomistics.calculators import evaluate_with_ase
     
+    if pseudo_dir is None:
+        pseudo_dir = os.path.join(os.getcwd(), "..", "tests", "integration")
+    
     return evaluate_with_ase(
         task_dict=task_dict,
         ase_calculator=Espresso(
             pseudopotentials=pseudopotentials,
-            tstress=True,
-            tprnfor=True,
+            tstress=tstress,
+            tprnfor=tprnfor,
             kpts=kpts,
             profile=EspressoProfile(
-                 command="flux run pw.x",
-                 pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
+                 command=command,
+                 pseudo_dir=pseudo_dir,
             ),
         ),
     )["energy"]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74b8710 and 5140389.

📒 Files selected for processing (1)
  • notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (17)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: minimal
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
  • GitHub Check: notebooks_integration
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: unittest_win
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
  • GitHub Check: notebooks
  • GitHub Check: unittest_mpich (ubuntu-latest, 3.11)
  • GitHub Check: unittest_mpich (ubuntu-latest, 3.12)
  • GitHub Check: unittest_mpich (macos-latest, 3.13)
  • GitHub Check: unittest_mpich (ubuntu-latest, 3.13)
  • GitHub Check: unittest_old
  • GitHub Check: unittest_flux_openmpi
  • GitHub Check: unittest_flux_mpich
🔇 Additional comments (3)
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (3)

184-193: Add error handling for job submission

The job submission code lacks error handling. If there are issues with the job submission or the Flux cluster is unavailable, the notebook will fail without a helpful error message.

Apply the previously suggested fix:

 future_dict = {}
 with FluxClusterExecutor() as exe:
-    for k, v in task_loop_dict.items():
-        future_dict[k] = exe.submit(
-            evaluate_with_quantum_espresso, 
-            task_dict=v, 
-            pseudopotentials=pseudopotentials, 
-            kpts=(3, 3, 3), 
-            resource_dict={"cores": 1, "threads_per_core": 2},
-        )
+    try:
+        for k, v in task_loop_dict.items():
+            future_dict[k] = exe.submit(
+                evaluate_with_quantum_espresso, 
+                task_dict=v, 
+                pseudopotentials=pseudopotentials, 
+                kpts=(3, 3, 3), 
+                resource_dict={"cores": 1, "threads_per_core": 2},
+            )
+    except Exception as e:
+        print(f"Error submitting jobs: {e}")
+        raise

196-199: Add error handling for result collection

The result collection process needs error handling to gracefully handle failures during job execution.

Apply the previously suggested fix:

-    result_dict = {
-        k: f.result() 
-        for k, f in tqdm(future_dict.items())
-    }
+    result_dict = {}
+    try:
+        for k, f in tqdm(future_dict.items()):
+            try:
+                result_dict[k] = f.result()
+            except Exception as e:
+                print(f"Error getting result for job {k}: {e}")
+    except Exception as e:
+        print(f"Error collecting results: {e}")
+        if not result_dict:
+            raise  # Only re-raise if we got no results at all

228-232: Fix invalid escape sequence in the xlabel

The code triggers a SyntaxWarning about an invalid escape sequence '\A' in the xlabel.

Fix the escape sequence by using a raw string:

- plt.xlabel("Volume [$\\AA^3$]")
+ plt.xlabel(r"Volume [$\AA^3$]")

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (2)

74-90: ⚠️ Potential issue

Hardcoded path in evaluate_with_quantum_espresso function

The function includes a hardcoded path to the pseudopotentials directory that won't work in environments other than the CI environment.

Use an environment variable or a relative path for portability:

            profile=EspressoProfile(
                 command="flux run -n 2 pw.x",
-                pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
+                pseudo_dir=os.environ.get("ESPRESSO_PSEUDO", "pseudo"),
            ),

265-266: ⚠️ Potential issue

Fix the invalid escape sequence warning in matplotlib label

The current string format is causing a SyntaxWarning for an invalid escape sequence '\A'. Use a raw string with the correct escape sequence for the Angstrom symbol.

-plt.xlabel("Volume [$\\AA^3$]")
+plt.xlabel(r"Volume [$\AA^3$]")
🧹 Nitpick comments (3)
.github/workflows/pipeline.yml (2)

400-401: Update autobot dependencies to include the new integration job

The autobot job should depend on the new notebooks_integration job to ensure PRs aren't automatically merged before all tests pass.

- needs: [unittest_old, unittest_win, unittest_openmpi, unittest_mpich, unittest_flux_openmpi, unittest_flux_mpich, notebooks, benchmark, minimal, pip_check, mypy]
+ needs: [unittest_old, unittest_win, unittest_openmpi, unittest_mpich, unittest_flux_openmpi, unittest_flux_mpich, notebooks, notebooks_integration, benchmark, minimal, pip_check, mypy]

198-198: Consider the timeout duration for integration notebooks

The timeout is set to 20 minutes, which is significantly longer than the 5 minutes for regular notebooks. Make sure this is sufficient but not excessive for running these computational notebooks.

notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1)

187-209: Consider simplifying the flux job execution and monitoring

The notebook has verbose output from flux jobs and resources commands. For better readability, you might want to consolidate the monitoring into a single cell or create a helper function.

Example of a monitoring helper function:

def monitor_flux_jobs():
    """Helper function to monitor flux jobs and resources"""
    print("Flux resources:")
    print(subprocess.check_output(["flux", "resource", "list"], universal_newlines=True))
    print("\nFlux jobs:")
    print(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True))
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5140389 and d07609a.

📒 Files selected for processing (3)
  • .github/workflows/pipeline.yml (1 hunks)
  • notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (1 hunks)
  • notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: unittest_win
  • GitHub Check: notebooks_integration
  • GitHub Check: notebooks
🔇 Additional comments (7)
.github/workflows/pipeline.yml (2)

176-204: New integration job for running application notebooks looks well-structured

The addition of the notebooks_integration job follows the established pattern in the workflow and provides a good way to test the new application notebooks as part of CI.


202-202:

❓ Verification inconclusive

Verify if this notebook is intentionally commented out

Line 202 has a commented-out notebook execution for 5-3-flux-cluster-qe-bulk-modulus.ipynb. Is this intentional, or should it be enabled?


🏁 Script executed:

#!/bin/bash
# Check if the notebook file exists
test -f notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb && echo "Notebook exists but is commented out in the workflow" || echo "Notebook doesn't exist, so commenting it out is correct"

Length of output: 173


Action Required: Confirm the Intent for the Commented Notebook Execution

The shell script confirms that notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb exists, yet its execution command remains commented out in .github/workflows/pipeline.yml (line 202). Please verify if this is intentional. If the execution should remain disabled, consider adding a clarifying comment to explain the rationale. Otherwise, if it was commented out by oversight, please enable it.

notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1)

1-301: Well-structured notebook for calculating aluminum bulk modulus

The notebook is well-organized with clear separation between setup, computation, and visualization. It correctly uses the FluxJobExecutor for parallel computation and properly visualizes the results.

notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (4)

87-88: Update the pseudopotential directory path.

The hardcoded pseudopotential directory path could cause issues in different environments. Consider using a relative path or environment variable to make the notebook more portable.

This issue was identified in a previous review and marked as addressed in commit 9e60689, but the code still contains the hardcoded path. Please verify this is the intended path for the integration environment.


184-193: Add error handling for job submission.

The job submission code lacks error handling. If a job fails or if the Flux cluster is not available, the notebook will crash without helpful error messages.

Consider implementing the previously suggested error handling:

 future_dict = {}
 with FluxClusterExecutor() as exe:
-    for k, v in task_loop_dict.items():
-        future_dict[k] = exe.submit(
-            evaluate_with_quantum_espresso, 
-            task_dict=v, 
-            pseudopotentials=pseudopotentials, 
-            kpts=(3, 3, 3), 
-            resource_dict={"cores": 1, "threads_per_core": 2},
-        )
+    try:
+        for k, v in task_loop_dict.items():
+            future_dict[k] = exe.submit(
+                evaluate_with_quantum_espresso, 
+                task_dict=v, 
+                pseudopotentials=pseudopotentials, 
+                kpts=(3, 3, 3), 
+                resource_dict={"cores": 1, "threads_per_core": 2},
+            )
+    except Exception as e:
+        print(f"Error submitting jobs: {e}")
+        raise

196-199: Add error handling for result collection.

The result collection process should be wrapped in a try-except block to gracefully handle any failures during job execution.

Implement the previously suggested error handling:

-    result_dict = {
-        k: f.result() 
-        for k, f in tqdm(future_dict.items())
-    }
+    result_dict = {}
+    try:
+        for k, f in tqdm(future_dict.items()):
+            try:
+                result_dict[k] = f.result()
+            except Exception as e:
+                print(f"Error getting result for job {k}: {e}")
+    except Exception as e:
+        print(f"Error collecting results: {e}")
+        if not result_dict:
+            raise  # Only re-raise if we got no results at all

257-260: Fix invalid escape sequence in the xlabel.

The warning message in the output (lines 229-233) indicates an invalid escape sequence '\A' in the xlabel. This should be fixed to prevent warnings during execution.

Use a raw string prefix for the xlabel:

- plt.xlabel("Volume [$\\AA^3$]")
+ plt.xlabel(r"Volume [$\AA^3$]")

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (3)
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (3)

184-206: ⚠️ Potential issue

Error handling still needed for job submission and monitoring.

The job submission code still lacks error handling. The previously suggested improvement to wrap the job submission and monitoring code in try-except blocks has not been implemented. If a job fails or the Flux cluster is unavailable, the notebook will crash without helpful error messages.

Apply a similar fix as suggested in the previous review:

 future_dict = {}
 with FluxClusterExecutor(flux_executor_nesting=True) as exe:
-    for k, v in task_loop_dict.items():
-        os.makedirs(os.path.abspath(("strain_%0.2f" % k).replace(".", "_")), exist_ok=True)
-        future_dict[k] = exe.submit(
-            evaluate_with_quantum_espresso, 
-            task_dict=v, 
-            pseudopotentials=pseudopotentials, 
-            kpts=(3, 3, 3), 
-            resource_dict={
-                "cores": 1,
-                "threads_per_core": 2,
-                "cwd": os.path.abspath(("strain_%0.2f" % k).replace(".", "_")),
-            },
-        )
+    try:
+        for k, v in task_loop_dict.items():
+            os.makedirs(os.path.abspath(("strain_%0.2f" % k).replace(".", "_")), exist_ok=True)
+            future_dict[k] = exe.submit(
+                evaluate_with_quantum_espresso, 
+                task_dict=v, 
+                pseudopotentials=pseudopotentials, 
+                kpts=(3, 3, 3), 
+                resource_dict={
+                    "cores": 1,
+                    "threads_per_core": 2,
+                    "cwd": os.path.abspath(("strain_%0.2f" % k).replace(".", "_")),
+                },
+            )
+                },
+            )
+    except Exception as e:
+        print(f"Error submitting jobs: {e}")
+        raise

201-204: ⚠️ Potential issue

Add error handling for result collection as recommended previously.

The result collection process should be wrapped in a try-except block to gracefully handle any failures during job execution, as suggested in the previous review but not implemented.

Apply the previously suggested fix:

-    result_dict = {
-        k: f.result() 
-        for k, f in tqdm(future_dict.items())
-    }
+    result_dict = {}
+    try:
+        for k, f in tqdm(future_dict.items()):
+            try:
+                result_dict[k] = f.result()
+            except Exception as e:
+                print(f"Error getting result for job {k}: {e}")
+    except Exception as e:
+        print(f"Error collecting results: {e}")
+        if not result_dict:
+            raise  # Only re-raise if we got no results at all

263-263: ⚠️ Potential issue

Fix invalid escape sequence in the xlabel.

The SyntaxWarning about an invalid escape sequence '\A' in the xlabel is still present (as shown in lines 233-238 of the output). This should be fixed as suggested in the previous review.

Apply the previously suggested fix:

- plt.xlabel("Volume [$\\AA^3$]")
+ plt.xlabel(r"Volume [$\AA^3$]")
🧹 Nitpick comments (4)
notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (4)

74-90: Add docstring to the evaluation function.

The evaluate_with_quantum_espresso function lacks a docstring explaining its purpose, parameters, and return values. This would improve the notebook's maintainability and readability.

Add a docstring to provide better context:

 def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts):
+    """
+    Evaluate structures using Quantum Espresso calculator.
+    
+    Parameters:
+    -----------
+    task_dict : dict
+        Dictionary containing atomic structures to evaluate
+    pseudopotentials : dict
+        Dictionary mapping element symbols to pseudopotential files
+    kpts : tuple
+        K-point mesh for the calculation (e.g., (3, 3, 3))
+    
+    Returns:
+    --------
+    dict
+        Dictionary of calculated energies
+    """
     from ase.calculators.espresso import Espresso, EspressoProfile
     from atomistics.calculators import evaluate_with_ase

87-88: Consider using a configurable path for pseudopotentials.

The path to pseudopotential files is hardcoded, which could cause problems when running in different environments. Making it configurable would improve portability.

Consider modifying the code to make the pseudopotential directory configurable:

+    # Get the pseudopotential path from environment variable or use default
+    pseudo_dir = os.environ.get(
+        "QE_PSEUDO_DIR", 
+        "/home/runner/work/executorlib/executorlib/tests/integration"
+    )
     return evaluate_with_ase(
         task_dict=task_dict,
         ase_calculator=Espresso(
             pseudopotentials=pseudopotentials,
             tstress=True,
             tprnfor=True,
             kpts=kpts,
             profile=EspressoProfile(
                  command="flux run pw.x",
-                 pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
+                 pseudo_dir=pseudo_dir,
             ),
         ),
     )["energy"]

49-54: Consider adding comments to explain structure generation parameters.

The structure generation parameters (number of points, volume range) lack explanation. Adding comments would make it clearer why these specific values were chosen.

Consider adding comments to explain the parameter choices:

 structure_dict = evcurve_generate_structures(
     structure=bulk("Al", a=4.15, cubic=True),
-    num_points=7,
-    vol_range=0.05,
+    num_points=7,  # Number of volume points to sample
+    vol_range=0.05,  # ±5% volume variation around equilibrium
     axes=("x", "y", "z"),
 )

192-192: Consider making k-point sampling configurable.

The k-point mesh is hardcoded to (3, 3, 3), which may not be optimal for all use cases. Making it configurable would improve flexibility.

Consider defining the k-point grid as a variable at the beginning of the notebook to make it easier to adjust:

+# Define computational parameters
+kpoint_grid = (3, 3, 3)  # k-point grid for QE calculations
+
+# ... (later in the code)
+
 future_dict[k] = exe.submit(
     evaluate_with_quantum_espresso, 
     task_dict=v, 
     pseudopotentials=pseudopotentials, 
-    kpts=(3, 3, 3), 
+    kpts=kpoint_grid,
     resource_dict={
         "cores": 1,
         "threads_per_core": 2,
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d07609a and 0d1ad0b.

📒 Files selected for processing (3)
  • .github/workflows/pipeline.yml (1 hunks)
  • notebooks/5-3-flux-cluster-qe-bulk-modulus.ipynb (1 hunks)
  • notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: unittest_win
🔇 Additional comments (6)
.github/workflows/pipeline.yml (1)

176-204: Well-structured job for integration testing of application notebooks

The added notebooks_integration job follows the project's established patterns and properly sets up the environment to run the new bulk modulus calculation notebooks. The job has appropriate timeouts and dependencies.

notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (5)

74-90: Hard-coded path for pseudopotentials may cause issues in different environments

The evaluate_with_quantum_espresso function contains a hard-coded path to the pseudopotentials directory (pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration"), which might not work in other environments.

Consider using environment variables or a relative path for better portability:

        profile=EspressoProfile(
             command="flux run pw.x",
-            pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
+            pseudo_dir=os.environ.get("ESPRESSO_PSEUDO", "pseudo"),
        ),

265-266: Fix the invalid escape sequence warning in matplotlib label

The current string format is causing a SyntaxWarning for an invalid escape sequence '\A'. Use a raw string with the correct escape sequence for the Angstrom symbol.

-plt.xlabel("Volume [$\\AA^3$]")
+plt.xlabel(r"Volume [$\AA^3$]")

17-30: LGTM: Appropriate imports for the notebook's purpose

The notebook imports necessary libraries for atomic simulations, job execution management, and result visualization.


187-210: Well-structured parallel execution pattern using FluxJobExecutor

The implementation correctly uses FluxJobExecutor to submit multiple strain calculation tasks in parallel, while properly handling directory creation for each task and monitoring job progress.


219-224: Good use of the analysis helper for bulk modulus calculation

The implementation correctly uses the evcurve_analyse_structures function to process the calculation results and extract the bulk modulus value.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1)

86-88: 🛠️ Refactor suggestion

The pseudo_dir path is still hardcoded.

While the path has been updated from the previous personal path to a CI-compatible path, it's still hardcoded which could cause issues in different environments. This approach isn't portable across different systems.

Consider using an environment variable or a relative path:

            profile=EspressoProfile(
                 command="flux run pw.x",
-                pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
+                pseudo_dir=os.environ.get("ESPRESSO_PSEUDO", "./pseudo"),
            ),

This makes the code more portable and allows for configuration through environment variables.

🧹 Nitpick comments (4)
notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (4)

196-200: Resource allocation is fixed and may not be optimal for all environments.

The current implementation uses fixed values for cores and threads which might not be optimal in all execution environments.

Consider making the resource allocation configurable:

            resource_dict={
-                "cores": 1, 
-                "threads_per_core": 2, 
+                "cores": int(os.environ.get("QE_CORES", 1)), 
+                "threads_per_core": int(os.environ.get("QE_THREADS_PER_CORE", 2)), 
                "cwd": os.path.abspath(("strain_%0.2f" % k).replace(".", "_")),
            },

This allows users to adjust resource allocation through environment variables without modifying the code.


202-203: Use of fixed sleep(1) for synchronization.

Using a fixed sleep time for synchronization between job submission and status checking can be unreliable, especially in busy systems.

Consider implementing a more robust synchronization mechanism:

-    sleep(1)
-    pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))
+    # Wait for job submission to register
+    max_retries = 5
+    for retry in range(max_retries):
+        try:
+            job_status = subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n")
+            if len(job_status) > 1:  # More than just the header line
+                pprint.pp(job_status)
+                break
+        except subprocess.SubprocessError:
+            pass
+        sleep(0.5)

This approach retries the status check several times at a short fixed interval, making it more resilient to timing variations.


74-90: Refactor evaluate_with_quantum_espresso to enhance reusability.

The function has limited flexibility because it doesn't allow customizing some important parameters of the Espresso calculator.

Consider making the function more configurable by accepting additional parameters:

-def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts):
+def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts, command="flux run pw.x", 
+                                   pseudo_dir=None, additional_parameters=None):
    from ase.calculators.espresso import Espresso, EspressoProfile
    from atomistics.calculators import evaluate_with_ase
    
+    if pseudo_dir is None:
+        pseudo_dir = os.environ.get("ESPRESSO_PSEUDO", "./pseudo")
+    
+    espresso_params = {
+        "pseudopotentials": pseudopotentials,
+        "tstress": True,
+        "tprnfor": True,
+        "kpts": kpts,
+    }
+    
+    if additional_parameters:
+        espresso_params.update(additional_parameters)
    
    return evaluate_with_ase(
        task_dict=task_dict,
-        ase_calculator=Espresso(
-            pseudopotentials=pseudopotentials,
-            tstress=True,
-            tprnfor=True,
-            kpts=kpts,
-            profile=EspressoProfile(
-                 command="flux run pw.x",
-                 pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
-            ),
-        ),
+        ase_calculator=Espresso(
+            **espresso_params,
+            profile=EspressoProfile(
+                command=command,
+                pseudo_dir=pseudo_dir,
+            ),
+        ),
    )["energy"]

This refactoring makes the function more flexible, allowing users to customize the behavior without modifying the function itself.


190-201: Consider error handling for job submission and execution.

The current implementation doesn't handle potential errors during job submission or execution, which could lead to difficult-to-debug issues.

Add error handling to make the code more robust:

-        future_dict[k] = exe.submit(
-            evaluate_with_quantum_espresso, 
-            task_dict=v, 
-            pseudopotentials=pseudopotentials, 
-            kpts=(3, 3, 3), 
-            resource_dict={
-                "cores": 1, 
-                "threads_per_core": 2, 
-                "cwd": os.path.abspath(("strain_%0.2f" % k).replace(".", "_")),
-            },
-        )
+        try:
+            future_dict[k] = exe.submit(
+                evaluate_with_quantum_espresso, 
+                task_dict=v, 
+                pseudopotentials=pseudopotentials, 
+                kpts=(3, 3, 3), 
+                resource_dict={
+                    "cores": 1, 
+                    "threads_per_core": 2, 
+                    "cwd": os.path.abspath(("strain_%0.2f" % k).replace(".", "_")),
+                },
+            )
+        except Exception as e:
+            print(f"Error submitting job for strain {k}: {str(e)}")

Similarly, add error handling around the result collection to gracefully handle failed jobs.
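
A guarded collection loop to pair with the guarded submission could look like this sketch; the strain-keyed future_dict mirrors the notebook, while the message wording is an assumption:

result_dict = {}
for k, f in tqdm(future_dict.items()):
    try:
        result_dict[k] = f.result()
    except Exception as e:
        print(f"Error collecting result for strain {k}: {e}")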

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d1ad0b and 4ba5f59.

📒 Files selected for processing (1)
  • notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: notebooks_integration
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
  • GitHub Check: unittest_flux_openmpi
  • GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
  • GitHub Check: unittest_mpich (ubuntu-latest, 3.11)
  • GitHub Check: unittest_win
  • GitHub Check: unittest_mpich (macos-latest, 3.13)
  • GitHub Check: unittest_flux_mpich
  • GitHub Check: notebooks
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
🔇 Additional comments (2)
notebooks/5-4-flux-job-qe-bulk-modulus.ipynb (2)

265-266: Fix the invalid escape sequence warning in matplotlib label.

The current string format is causing a SyntaxWarning for an invalid escape sequence '\A'. This issue was identified in a previous review but hasn't been fixed.

Apply this fix to properly render the Angstrom symbol without generating warnings:

-plt.xlabel("Volume [$\\AA^3$]")
+plt.xlabel(r"Volume [$\AA^3$]")

1-31: LGTM: Well-structured imports and setup

The imports section is well-organized, importing all necessary libraries for the computation task. The code properly separates different components like structure generation, calculation setup, job execution, and analysis.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (6)
docs/_toc.yml (1)

1-8: Contextual Consistency Check

The overall structure of the TOC remains consistent with earlier chapters. Ensure that the new chapter’s content in application.md includes sufficient context (e.g., description “Calculate the bulk modulus”) to guide users.

Also applies to: 13-15
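
For reference, the chapter entry presumably follows the Jupyter Book TOC layout along these lines — a sketch, since the exact paths may differ from this excerpt:

chapters:
- file: application.md
  sections:
  - file: 5-1-gpaw.ipynb
  - file: 5-2-quantum-espresso.ipynb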

.github/workflows/pipeline.yml (1)

176-202: Integration Job for Notebooks Added Successfully

The new notebooks_integration job is properly defined:

  • It correctly uses actions/checkout@v4 and sets up the environment using Mambaforge with Python 3.12.
  • The environment file .ci_support/environment-integration.yml is specified, which distinguishes it from the other notebook jobs.
  • The steps for installing dependencies and executing the notebooks with flux start papermill ... are clear and consistent with the CI/CD pipeline practices.

A couple of minor suggestions:

  • Logging and Debugging: Consider adding additional logging or enabling verbose output in the notebook execution step to facilitate easier diagnosis if failures occur during integration tests.
  • Environment Verification: It might be useful to double-check that the .ci_support/environment-integration.yml file includes all dependencies needed for running the GPAW and Quantum Espresso notebooks.
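
On the second point, a plausible shape for .ci_support/environment-integration.yml — the package list here is an assumption derived from the notebooks' imports, not the verified file contents:

channels:
- conda-forge
dependencies:
- python=3.12
- papermill
- flux-core
- gpaw
- qe
- ase
- atomistics
- matplotlib
- tqdm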

Overall, the changes integrate well into the existing pipeline structure.

notebooks/5-1-gpaw.ipynb (1)

183-201: Well-implemented job submission and management with FluxClusterExecutor.

The implementation properly handles job submission, monitoring progress with tqdm, and collecting results. However, there's no error handling for potential job failures.

Consider adding try/except blocks to handle potential job failures:

- result_dict = {
-     k: f.result()[-1] 
-     for k, f in tqdm(future_dict.items())
- }
+ result_dict = {}
+ for k, f in tqdm(future_dict.items()):
+     try:
+         result_dict[k] = f.result()[-1]
+     except Exception as e:
+         print(f"Job for {k} failed with error: {e}")
notebooks/5-2-quantum-espresso.ipynb (3)

38-54: Consider externalizing pseudo_dir path.
The evaluate_with_quantum_espresso function hard-codes the pseudo_dir path (/home/runner/work/executorlib/executorlib/tests/integration). For improved portability and maintainability, consider using an environment variable or a configuration setting instead of a hard-coded path.


278-279: Avoid repeating plotting code in consecutive cells.
Cells at lines 278-279 and 461-462 produce similarly structured plots. Factor out the common plotting steps into a helper function or a single cell to maintain DRY (Don’t Repeat Yourself) principles.

Also applies to: 461-462
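
A helper along these lines would remove the duplication; the fit_dict keys match those used elsewhere in this PR, while the function name is hypothetical:

import matplotlib.pyplot as plt

def plot_energy_volume(fit_dict):
    # Plot the energy-volume curve with the fitted bulk modulus in the legend
    plt.plot(
        fit_dict["volume"],
        fit_dict["energy"],
        label="$B_0$= %0.2f GPa" % fit_dict["bulkmodul_eq"],
    )
    plt.xlabel(r"Volume [$\AA^3$]")
    plt.ylabel("Energy [eV]")
    plt.legend()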


32-36: Add docstring for the evaluate_with_quantum_espresso function.
It's a key method, but it lacks a docstring explaining parameters (like task_dict, pseudopotentials, kpts) and the return structure. A docstring will improve clarity and maintainability.
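
A possible docstring — the parameter descriptions are inferred from the notebook's usage, not taken from upstream documentation:

def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts):
    """Evaluate structures with the ASE Quantum Espresso calculator.

    Args:
        task_dict: Structures to evaluate, as produced by the evcurve helpers.
        pseudopotentials: Mapping from element symbol to pseudopotential file name.
        kpts: K-point mesh for the calculation, e.g. (3, 3, 3).

    Returns:
        The calculated energies for the structures in task_dict.
    """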

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ba5f59 and 1f444fb.

📒 Files selected for processing (5)
  • .github/workflows/pipeline.yml (1 hunks)
  • docs/_toc.yml (1 hunks)
  • docs/application.md (1 hunks)
  • notebooks/5-1-gpaw.ipynb (1 hunks)
  • notebooks/5-2-quantum-espresso.ipynb (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/application.md
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: notebooks
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: notebooks_integration
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
🔇 Additional comments (7)
docs/_toc.yml (1)

9-12: New Application Chapter Added to TOC

The new chapter "application.md" with its sections for 5-1-gpaw.ipynb and 5-2-quantum-espresso.ipynb is clearly added and follows the established YAML structure. The indentation and syntax appear correct.

notebooks/5-1-gpaw.ipynb (5)

28-28: Well-organized imports for this computational notebook.

The imports cover all the necessary libraries for this materials science workflow, including system utilities, ASE for atomic structures, plotting tools, and monitoring capabilities.


38-47: Good implementation of the GPAW evaluation function.

The function is well-designed with clear parameters and efficiently sets up the GPAW calculator with appropriate settings. This makes it reusable for both executor types later in the notebook.


57-62: Structure generation is concisely implemented.

The code efficiently generates aluminum structures with varying volumes for the equation of state calculation using the helper function.


95-114: Useful diagnostic output for Flux resource monitoring.

These commands provide important visibility into the cluster resources and job status, which is valuable for understanding the execution environment.


1-471: Well-structured notebook for GPAW materials science calculations.

The notebook effectively demonstrates how to:

  1. Set up computational materials science workflows using GPAW
  2. Leverage Flux executors for parallel task execution
  3. Analyze and visualize energy-volume relationships

Some suggestions for further improvement:

  1. Add more explanatory markdown cells describing the scientific background and expected outcomes
  2. Parameterize hardcoded values (like kpoints and energy cutoff) for easier experimentation
  3. Add validation checks for calculation results

Overall, this is a valuable integration test that demonstrates the capability to perform DFT calculations with proper job management.

notebooks/5-2-quantum-espresso.ipynb (1)

64-64: Request verification of pseudopotential coverage.
Currently, pseudopotentials includes only "Al" in the dictionary. If additional elements are likely, ensure you dynamically manage or verify coverage to avoid unknown element lookups later.

Would you like to run a shell script to search for additional element usage or references to “pseudopotentials” across the codebase?
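
A quick check along these lines would surface any other element references — a sketch; the patterns are assumptions:

#!/bin/bash
# List every place the notebooks reference pseudopotentials or .UPF files
grep -rn "pseudopotentials" notebooks/ --include="*.ipynb"
grep -rn "\.UPF" notebooks/ --include="*.ipynb"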

Comment on lines +362 to +378
with FluxJobExecutor() as exe:
    for k, v in task_loop_dict.items():
        future_dict[k] = exe.submit(
            evaluate_with_gpaw, 
            task_dict=v, 
            kpts=(3, 3, 3), 
            encut=300,
            resource_dict={"cores": 2},
        )
    sleep(1)
    pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))
    result_dict = {
        k: f.result()[-1] 
        for k, f in tqdm(future_dict.items())
    }
    sleep(1)
    pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))

🛠️ Refactor suggestion

Code duplication with the FluxClusterExecutor section.

This section repeats the same pattern as the earlier FluxClusterExecutor section with minimal differences.

Consider refactoring the job execution logic into a reusable function:

def run_gpaw_jobs(executor_class, task_dict, kpts=(3, 3, 3), encut=300, cores=2):
    """Run GPAW calculations using the specified executor."""
    future_dict = {}
    with executor_class() as exe:
        for k, v in task_dict.items():
            future_dict[k] = exe.submit(
                evaluate_with_gpaw, 
                task_dict=v, 
                kpts=kpts, 
                encut=encut,
                resource_dict={"cores": cores},
            )
        sleep(1)
        pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))
        result_dict = {}
        for k, f in tqdm(future_dict.items()):
            try:
                result_dict[k] = f.result()[-1]
            except Exception as e:
                print(f"Job for {k} failed with error: {e}")
        sleep(1)
        pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))
    return result_dict

# Then use it like:
# result_dict = run_gpaw_jobs(FluxClusterExecutor, task_loop_dict)
# result_dict = run_gpaw_jobs(FluxJobExecutor, task_loop_dict)

Comment on lines +228 to +231
<>:2: SyntaxWarning: invalid escape sequence '\A'
<>:2: SyntaxWarning: invalid escape sequence '\A'
/tmp/ipykernel_21313/4024930470.py:2: SyntaxWarning: invalid escape sequence '\A'
  plt.xlabel("Volume [$\AA^3$]")

⚠️ Potential issue

Fix the SyntaxWarning for the invalid escape sequence.

The plotting code has an invalid escape sequence \A in the xlabel string.

Apply this fix to correct the escape sequence:

- plt.xlabel("Volume [$\\AA^3$]")
+ plt.xlabel(r"Volume [$\AA^3$]")

Comment on lines +405 to +410
<>:2: SyntaxWarning: invalid escape sequence '\A'
<>:2: SyntaxWarning: invalid escape sequence '\A'
/tmp/ipykernel_21313/4024930470.py:2: SyntaxWarning: invalid escape sequence '\A'
  plt.xlabel("Volume [$\AA^3$]")

⚠️ Potential issue

Same SyntaxWarning for invalid escape sequence appears again.

The same issue with the invalid escape sequence \A appears in the second plotting section.

Apply the same fix:

- plt.xlabel("Volume [$\\AA^3$]")
+ plt.xlabel(r"Volume [$\AA^3$]")

Comment on lines +112 to +113
pprint.pp(subprocess.check_output(["flux", "resource", "list"], universal_newlines=True).split("\n"))

🛠️ Refactor suggestion

Add error handling for subprocess calls and flux commands.
Commands like subprocess.check_output(["flux", "resource", "list"]) lack exception handling, which may cause unhandled failures if the command is unavailable or returns an error. Wrap these calls in try-except blocks or validate return codes for more robust error handling.

Also applies to: 130-131, 216-216, 222-222, 398-398, 404-404
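
One way to guard these calls — a minimal sketch; the helper name is hypothetical:

def flux_output(*args):
    # Run a flux subcommand, returning its output or a readable error string
    try:
        return subprocess.check_output(["flux", *args], universal_newlines=True)
    except (subprocess.CalledProcessError, FileNotFoundError) as e:
        return f"flux {' '.join(args)} failed: {e}"

pprint.pp(flux_output("resource", "list").split("\n"))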

Comment on lines +249 to +254
<>:2: SyntaxWarning: invalid escape sequence '\A'
<>:2: SyntaxWarning: invalid escape sequence '\A'
/tmp/ipykernel_31059/4024930470.py:2: SyntaxWarning: invalid escape sequence '\A'
  plt.xlabel("Volume [$\AA^3$]")

⚠️ Potential issue

Fix invalid escape sequences in plot labels.
You have SyntaxWarning logs referring to the invalid escape sequence '\A'. Update the Volume [$\AA^3$] label using raw string notation (or by doubling the backslash) to avoid these warnings.

Apply this diff to fix one instance of the escape sequence (repeat as needed):

-plt.xlabel("Volume [$\\AA^3$]")
+plt.xlabel(r"Volume [$\AA^3$]")

Also applies to: 433-437

Comment on lines +215 to +221
    sleep(1)
    pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))
    result_dict = {
        k: f.result() 
        for k, f in tqdm(future_dict.items())
    }
    sleep(1)

🛠️ Refactor suggestion

Revisit usage of 'sleep(1)' in the HPC job flow.
Multiple sleeps may not guarantee that flux jobs have initialized or completed. This approach can introduce race conditions or extraneous delays. A more reliable job-polling mechanism or an event-based trigger can lead to a more robust workflow.

Also applies to: 398-404
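
Assuming the futures returned by the executor follow the standard concurrent.futures interface (consistent with the Executor-style API used here), result collection can be event-driven instead of sleep-based — a sketch:

from concurrent.futures import as_completed

result_dict = {}
key_by_future = {f: k for k, f in future_dict.items()}
for f in as_completed(key_by_future):
    k = key_by_future[f]
    try:
        result_dict[k] = f.result()
    except Exception as e:
        print(f"Job for strain {k} failed: {e}")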

Comment on lines 384 to 406
with FluxJobExecutor(flux_executor_nesting=True) as exe:
    for k, v in task_loop_dict.items():
        os.makedirs(os.path.abspath(("strain_%0.2f" % k).replace(".", "_")), exist_ok=True)
        future_dict[k] = exe.submit(
            evaluate_with_quantum_espresso, 
            task_dict=v, 
            pseudopotentials=pseudopotentials, 
            kpts=(3, 3, 3), 
            resource_dict={
                "cores": 1, 
                "threads_per_core": 2, 
                "cwd": os.path.abspath(("strain_%0.2f" % k).replace(".", "_")),
            },
        )
    sleep(1)
    pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))
    result_dict = {
        k: f.result() 
        for k, f in tqdm(future_dict.items())
    }
    sleep(1)
    pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))

🛠️ Refactor suggestion

Reduce code duplication between FluxClusterExecutor and FluxJobExecutor usage.
These two code blocks share repetitive logic for job submission and result collection. Consider abstracting these workflows into a common function or a single executor-based approach for easier long-term maintenance.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/application.md (1)

1-7: Documentation Section: Clarity and Consistency
The new "Application" section provides a clear overview of how executorlib integrates with atomistic simulation tools. One minor nitpick: consider replacing “up-scale” with “upscale” for improved clarity and consistency with common usage.

Suggested Diff:

-While `executorlib` is designed to up-scale any Python function for high performance computing (HPC), it was initially
+While `executorlib` is designed to upscale any Python function for high performance computing (HPC), it was initially
🧰 Tools
🪛 LanguageTool

[uncategorized] ~2-~2: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...ned to up-scale any Python function for high performance computing (HPC), it was initially devel...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f444fb and 4176dd6.

📒 Files selected for processing (3)
  • README.md (1 hunks)
  • docs/_toc.yml (1 hunks)
  • docs/application.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/_toc.yml
🧰 Additional context used
🪛 LanguageTool
docs/application.md

[uncategorized] ~2-~2: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...ned to up-scale any Python function for high performance computing (HPC), it was initially devel...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: notebooks_integration
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: unittest_win
🔇 Additional comments (1)
README.md (1)

142-144: New Documentation Links: Application Section
The newly added entries for [Application] with nested links for [GPAW] and [Quantum Espresso] are correctly formatted and align well with the overall documentation structure. These links effectively direct users to the corresponding detailed pages.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (9)
docs/application.md (5)

2-2: Compound Adjective Formatting:
The phrase “up-scale any Python function” might be improved by either removing the hyphen (i.e., “upscale any Python function”) or rephrasing it (e.g., “scale up any Python function”) to adhere to common conventions and enhance clarity.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~2-~2: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...ned to up-scale any Python function for high performance computing (HPC), it was initially devel...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


2-6: Technical Context and Consistency:
The description of how executorlib accelerates atomistic simulation is informative. Consider ensuring consistent naming—if “atomistics” is a formal project name, it might be capitalized—and verify that all tool names (e.g., the atomic simulation environment, GPAW, Quantum Espresso) match their official documentation for consistency.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~2-~2: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...ned to up-scale any Python function for high performance computing (HPC), it was initially devel...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


7-9: Detail Enhancement for Material Properties:
The explanation of deriving the bulk modulus via energy changes is clear. For added clarity, you might consider linking to additional resources or documentation on the numerical methods (derivatives) used in this context, which could assist users seeking deeper technical insight.
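
For instance, the standard relation (the general definition, not a quote from the notebook) could anchor such a link:

B_0 = V_0 \left. \frac{\partial^2 E}{\partial V^2} \right|_{V = V_0}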


10-11: Extraneous Line Detected:
There appears to be a stray “11” at the end of the document. If this is unintentional, please remove it to maintain a clean format.


1-11: Alignment with PR Objectives:
The PR title and objectives mention “integration tests for application notebooks,” yet this document does not reference the notebooks (e.g., “5-1-gpaw.ipynb” or “5-2-quantum-espresso.ipynb”) or their associated integration tests. Consider adding a section or a note that links to these notebooks or explains how users can run the integration tests. This will help align the documentation with the broader PR objectives.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~2-~2: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...ned to up-scale any Python function for high performance computing (HPC), it was initially devel...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

notebooks/5-1-gpaw.ipynb (3)

44-53: Add docstring to the evaluate_with_gpaw function.

For better code clarity and documentation, add a docstring to explain the function parameters and return value.

def evaluate_with_gpaw(task_dict, kpts, encut):
+    """
+    Calculate potential energy using GPAW.
+    
+    Args:
+        task_dict: Dictionary containing structure to calculate energy for
+        kpts: k-point sampling as a tuple (e.g., (3, 3, 3))
+        encut: Energy cutoff for plane-wave basis in eV
+        
+    Returns:
+        Potential energy of the structure
+    """
    from gpaw import GPAW, PW

    structure = task_dict["calc_energy"].copy()
    structure.calc = GPAW(
        xc="PBE",
        mode=PW(encut),
        kpts=kpts,
    )
    return structure.get_potential_energy()

214-231: Add error handling for job results collection.

To make the code more robust, add error handling when collecting results from the submitted jobs. This would help identify which specific calculation failed if there's an issue.

-    result_dict = {
-        k: f.result()[-1] 
-        for k, f in tqdm(future_dict.items())
-    }
+    result_dict = {}
+    for k, f in tqdm(future_dict.items()):
+        try:
+            result_dict[k] = f.result()[-1]
+        except Exception as e:
+            print(f"Job for {k} failed with error: {e}")
+            result_dict[k] = None

423-428: Add error handling for FluxJobExecutor results collection as well.

Similar to the suggestion for the FluxClusterExecutor section, add error handling when collecting results from the jobs in the FluxJobExecutor section.

-    result_dict = {
-        k: f.result()[-1] 
-        for k, f in tqdm(future_dict.items())
-    }
+    result_dict = {}
+    for k, f in tqdm(future_dict.items()):
+        try:
+            result_dict[k] = f.result()[-1]
+        except Exception as e:
+            print(f"Job for {k} failed with error: {e}")
+            result_dict[k] = None
notebooks/5-2-quantum-espresso.ipynb (1)

39-55: Consider adding a minimal try-except wrapper for evaluate_with_quantum_espresso.
While the function logic is straightforward, you might gracefully handle unexpected errors (e.g., missing pseudopotential files or invalid k-point grids) by logging or re-raising an informative exception.

 def evaluate_with_quantum_espresso(task_dict, pseudopotentials, kpts):
   from ase.calculators.espresso import Espresso, EspressoProfile
   from atomistics.calculators import evaluate_with_ase

+  try:
       return evaluate_with_ase(
           task_dict=task_dict,
           ase_calculator=Espresso(
               pseudopotentials=pseudopotentials,
               tstress=True,
               tprnfor=True,
               kpts=kpts,
               profile=EspressoProfile(
                    command="flux run pw.x",
                    pseudo_dir="/home/runner/work/executorlib/executorlib/tests/integration",
               ),
           ),
       )["energy"]
+  except Exception as e:
+      raise RuntimeError(f"Quantum Espresso evaluation failed: {e}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4176dd6 and 33087ad.

📒 Files selected for processing (3)
  • docs/application.md (1 hunks)
  • notebooks/5-1-gpaw.ipynb (1 hunks)
  • notebooks/5-2-quantum-espresso.ipynb (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/application.md

[uncategorized] ~2-~2: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...ned to up-scale any Python function for high performance computing (HPC), it was initially devel...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: unittest_win
  • GitHub Check: unittest_openmpi (macos-latest, 3.13)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
  • GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
  • GitHub Check: notebooks_integration
🔇 Additional comments (15)
docs/application.md (1)

1-1: Header Clarity:
The "# Application" header is clear and effectively introduces the section.

notebooks/5-1-gpaw.ipynb (6)

271-274: Fix the SyntaxWarning for invalid escape sequence (previously noted).

The plotting code still has the invalid escape sequence \A in the xlabel string. This was noted in a previous review. You should fix this to avoid the warning.

- plt.xlabel("Volume [$\\AA^3$]")
+ plt.xlabel("Volume [$\\mathrm{\\AA}^3$]")

467-472: Same SyntaxWarning for invalid escape sequence appears again.

The same invalid escape sequence issue appears in the second plotting section as well, which was previously noted.

- plt.xlabel("Volume [$\\AA^3$]")
+ plt.xlabel("Volume [$\\mathrm{\\AA}^3$]")

412-428: Eliminate code duplication between executor sections.

Both the FluxClusterExecutor and FluxJobExecutor sections contain almost identical code, as noted in a previous review. Consider refactoring this into a reusable function:

def run_gpaw_jobs(executor_class, task_dict, kpts=(3, 3, 3), encut=300, cores=2):
    """Run GPAW calculations using the specified executor."""
    future_dict = {}
    with executor_class() as exe:
        for k, v in task_dict.items():
            future_dict[k] = exe.submit(
                evaluate_with_gpaw, 
                task_dict=v, 
                kpts=kpts, 
                encut=encut,
                resource_dict={"cores": cores},
            )
        sleep(1)
        pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))
        result_dict = {}
        for k, f in tqdm(future_dict.items()):
            try:
                result_dict[k] = f.result()[-1]
            except Exception as e:
                print(f"Job for {k} failed with error: {e}")
        sleep(1)
        pprint.pp(subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True).split("\n"))
    return result_dict

Then use it like:

result_dict = run_gpaw_jobs(FluxClusterExecutor, task_loop_dict)

1-11: Well-structured notebook introduction with clear explanation.

The notebook starts with a clear introduction of GPAW and how it integrates with executorlib. The explanation of hierarchical workflows and the benefits of using executorlib for parallelization is informative and sets the stage for the rest of the notebook.


69-74: The structure generation implementation is clear and well-parameterized.

The code for generating strained aluminum structures is concise and uses appropriate parameters. The use of named parameters improves readability and makes it clear what each value represents.
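
For readers without the notebook open, the pattern being praised is roughly the following (a sketch, not the notebook's exact code; the strain range, number of points, and lattice constant are illustrative values):

import numpy as np
from ase.build import bulk

structure = bulk("Al", a=4.05, cubic=True)
task_loop_dict = {}
for strain in np.linspace(0.95, 1.05, 11):
    strained = structure.copy()
    # Apply an isotropic volume strain by rescaling the cell edge by strain**(1/3).
    strained.set_cell(np.asarray(structure.cell) * strain ** (1 / 3), scale_atoms=True)
    task_loop_dict[strain] = {"calc_energy": strained}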


145-147: Good explanation of the choice of executor.

The introduction to the FluxClusterExecutor section provides helpful context about why this executor is used in the demonstration and mentions an alternative (SlurmClusterExecutor) that could be used analogously, which is valuable information for users.

notebooks/5-2-quantum-espresso.ipynb (8)

19-30: Imports appear organized and consistent.
No immediate issues are spotted regarding the import statements, and they provide a clear overview of the required dependencies for the notebook.


113-114: Add error handling for subprocess calls.
The Flux command is invoked directly without try-except. This matches a previously flagged concern regarding unhandled failures if the command is unavailable or returns an error.
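
A minimal guarded version of the call — a sketch that mirrors the command and output handling already used in the notebook:

import pprint
import subprocess

try:
    output = subprocess.check_output(["flux", "jobs", "-a"], universal_newlines=True)
    pprint.pp(output.split("\n"))
except FileNotFoundError:
    print("The 'flux' executable was not found on PATH.")
except subprocess.CalledProcessError as e:
    print(f"'flux jobs -a' exited with status {e.returncode}: {e.output}")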


131-131: Repeat the earlier request for error handling on Flux command.
This subprocess call is also unguarded in a try-except, mirroring the unhandled scenario from previous lines.


216-224: Revisit usage of sleep(1) in HPC job flow.
This approach may introduce a race condition or extraneous delay if Flux jobs do not initialize or terminate within the fixed time interval. Implement a more robust polling mechanism or an event-based trigger to confirm job readiness.
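
One possible shape for such a mechanism, assuming the executor returns standard concurrent.futures.Future objects (wait_until_submitted is a hypothetical helper, not executorlib API):

import time

def wait_until_submitted(futures, timeout=60.0, interval=0.5):
    # Poll until every future reports running or done, instead of a fixed sleep(1).
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if all(f.running() or f.done() for f in futures):
            return True
        time.sleep(interval)
    return False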


280-280: Fix the invalid escape sequence in the plot label.
“\A” can trigger a SyntaxWarning in Python. As noted in earlier reviews, double the backslash or use a raw string.

-plt.xlabel("Volume [$\AA^3$]")
+plt.xlabel(r"Volume [$\AA^3$]")

385-407: Reduce code duplication between FluxClusterExecutor and FluxJobExecutor usage.
These blocks share mostly identical resource dictionary construction and subsequent polling logic. Consider refactoring out the common logic into a helper function for long-term maintainability.
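
A sketch of what such a helper could look like, mirroring the run_gpaw_jobs refactor proposed for the GPAW notebook above (the name run_qe_jobs and the default arguments are assumptions, not existing notebook code):

def run_qe_jobs(executor_class, task_dict, pseudopotentials, kpts=(3, 3, 3), cores=2):
    # Submit one Quantum Espresso evaluation per task and collect the results.
    with executor_class() as exe:
        future_dict = {
            k: exe.submit(
                evaluate_with_quantum_espresso,
                task_dict=v,
                pseudopotentials=pseudopotentials,
                kpts=kpts,
                resource_dict={"cores": cores},
            )
            for k, v in task_dict.items()
        }
        return {k: f.result() for k, f in tqdm(future_dict.items())}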


399-405: Revisit sleep(1) usage once again.
Similar to the earlier comment, relying on an arbitrary delay can produce timing inconsistencies in HPC job flows. A job polling API or callback-based approach is recommended for reliability.


463-463: Address the second invalid escape sequence in plt.xlabel.
The same fix applies here to avoid SyntaxWarnings for “\A.”

-plt.xlabel("Volume [$\AA^3$]")
+plt.xlabel(r"Volume [$\AA^3$]")

@jan-janssen jan-janssen merged commit 3cbd96f into main Mar 28, 2025
74 of 76 checks passed
@jan-janssen jan-janssen deleted the integration branch March 28, 2025 20:20