Skip to content

Feature/SOF-7782 Maxwell disorder NB#261

Merged
VsevolodX merged 10 commits into
mainfrom
feature/SOF-7782
Dec 27, 2025
Merged

Feature/SOF-7782 Maxwell disorder NB#261
VsevolodX merged 10 commits into
mainfrom
feature/SOF-7782

Conversation

@VsevolodX
Copy link
Copy Markdown
Member

@VsevolodX VsevolodX commented Dec 27, 2025

Summary by CodeRabbit

  • New Features

    • Maxwell-Boltzmann thermal disorder workflow added with configurable disorder, seed, visualization, supercell option, and export.
  • Bug Fixes / Improvements

    • Improved helper utilities for job monitoring, data display, and file handling used across examples and demos (no public API signature changes).
    • Notebook metadata normalized (execution state/outputs) for consistency.
  • Chores

    • Removed automatic package-install step from some webinar scripts.
  • Documentation

    • New tutorial notebook documenting the thermal disorder workflow and usage.

✏️ Tip: You can customize this high-level summary in your review settings.

Copilot AI and others added 6 commits December 24, 2025 02:11
Co-authored-by: VsevolodX <79542055+VsevolodX@users.noreply.github.com>
Co-authored-by: VsevolodX <79542055+VsevolodX@users.noreply.github.com>
Co-authored-by: VsevolodX <79542055+VsevolodX@users.noreply.github.com>
@review-notebook-app
Copy link
Copy Markdown

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Dec 27, 2025

📝 Walkthrough

Walkthrough

Introduces a new Maxwell-Boltzmann thermal disorder tutorial and updates the Introduction TOC; splits and relocates many utility helpers from utils/generic.py into two new modules utils/api.py and utils/visualize.py, and updates numerous notebooks and scripts to import from the new locations.

Changes

Cohort / File(s) Summary
New tutorial & TOC
other/materials_designer/create_maxwell_disorder.ipynb, other/materials_designer/Introduction.ipynb
Added new Maxwell-Boltzmann thermal disorder notebook; added "5.2. Thermal Disorder" link in Introduction TOC.
Utils refactor — API helpers
utils/api.py, utils/generic.py
Added utils/api.py with save_files, get_jobs_statuses_by_ids, wait_for_jobs_to_finish, copy_bank_workflow_by_system_name, get_property_by_subworkflow_and_unit_indicies, get_cluster_name; removed those functions from utils/generic.py.
Utils refactor — Visualization helpers
utils/visualize.py, utils/generic.py
Added dataframe_to_html and in-file display_JSON in utils/visualize.py; removed those from utils/generic.py; updated callers to import from utils.visualize.
Notebooks & scripts — import updates / metadata normalization
examples/*/*.ipynb, other/*/*.ipynb, other/*/*.py (many files; e.g., examples/job/*.ipynb, examples/material/*.ipynb, other/materialsproject/*.py, other/webinar/*.py)
Updated imports to use utils.api and utils.visualize instead of utils.generic; normalized Jupyter cell metadata (execution_count, outputs, source formatting); minor JSON/data edits (e.g., other/materials_designer/uploads/0-Ni.json source string change).
Webinar scripts — installer removal
other/webinar/generate-al2o3-slab-structures.py, other/webinar/wulff-construction-surface-energy-study-cu.py
Removed automatic package-install helper call; adjusted imports to utils.api.save_files and utils.visualize.display_JSON.

Sequence Diagram(s)

sequenceDiagram
    %% Styling
    rect rgba(220,235,255,0.6)
      participant Notebook as Notebook (user)
      participant UtilAPI as utils.api
      participant JobEP as JobEndpoints
      participant Console as Console/Tabulate
    end

    Note over Notebook,UtilAPI: Job monitoring flow (wait_for_jobs_to_finish)

    Notebook->>UtilAPI: call wait_for_jobs_to_finish(job_ids)
    UtilAPI->>JobEP: get job statuses for job_ids
    JobEP-->>UtilAPI: return statuses
    alt not all finished
      UtilAPI->>Console: print status grid
      UtilAPI->>JobEP: sleep & poll again (loop)
    else all finished
      UtilAPI->>Notebook: return (all done)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • timurbazhirov

Poem

🐰
I hopped through notebooks, bright and merry,
Moved helpers, styled tables, made JSON cherry.
Maxwell shuffled atoms, warm and spry,
New notebooks teach — give them a try! 🎩✨

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Feature/SOF-7782 Maxwell disorder NB' is partially related to the changeset, referring to a real aspect (Maxwell disorder notebook), but obscures the broader refactoring effort with module reorganization. Consider clarifying the title to reflect the main change: either focus on 'Add Maxwell thermal disorder notebook' or expand to 'Refactor utils modules and add Maxwell disorder notebook' to capture the full scope.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/SOF-7782

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
other/materials_designer/create_maxwell_disorder.ipynb (3)

125-138: Consider adding input validation for better error handling.

Line 128 accesses materials[MATERIAL_INDEX] without checking if the list is empty or if the index is valid. While the usage instructions specify selecting input materials beforehand, adding validation would provide clearer error messages if users skip this step.

🔎 Suggested validation
+if not materials:
+    raise ValueError("No materials provided. Please select input materials before running the notebook.")
+if MATERIAL_INDEX >= len(materials):
+    raise IndexError(f"MATERIAL_INDEX {MATERIAL_INDEX} is out of range. Available materials: 0-{len(materials)-1}")
+
 unit_cell = materials[MATERIAL_INDEX]

186-192: Consider removing or documenting commented-out code.

Lines 186-190 contain a commented visualization block. For a tutorial notebook:

  • If this alternative approach is useful as an example, consider uncommenting it or adding a comment explaining why it's preserved.
  • If it's leftover from development, consider removing it to avoid confusion.

76-86: Inconsistent quote style in micropip.install calls.

Line 81 uses double quotes while line 82 uses single quotes for the micropip.install() calls. Consider using consistent quote style throughout.

Note: The static analysis warnings about top-level await statements are false positives—top-level await is valid in Jupyter notebooks with IPython 7.0+.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2dc2025 and 52d228f.

📒 Files selected for processing (2)
  • other/materials_designer/Introduction.ipynb
  • other/materials_designer/create_maxwell_disorder.ipynb
🧰 Additional context used
🪛 Ruff (0.14.10)
other/materials_designer/create_maxwell_disorder.ipynb

23-23: await statement outside of a function

(F704)


24-24: await statement outside of a function

(F704)


27-27: await statement outside of a function

(F704)

🔇 Additional comments (6)
other/materials_designer/Introduction.ipynb (1)

85-87: LGTM! TOC entry properly added.

The new thermal disorder section is correctly formatted and follows the existing structure conventions.

other/materials_designer/create_maxwell_disorder.ipynb (5)

8-26: LGTM! Clear documentation.

The introduction provides comprehensive usage instructions and helpful context about the Maxwell-Boltzmann thermal disorder implementation.


44-58: LGTM! Sensible configuration defaults.

The parameters provide reasonable defaults (room temperature at 300K, reproducible seed, mass-dependent displacement). Parameter validation is expected to be handled by downstream functions.


104-107: LGTM! Standard material retrieval pattern.

The material retrieval follows the expected pattern used across notebooks.


157-168: LGTM! Disorder application is correct.

The Maxwell-Boltzmann displacement is properly applied with appropriate parameters, and the output provides useful feedback.


209-212: LGTM! Standard output pattern.

The material output follows the expected pattern for passing results to the external runtime.

Feature/SOF-7782-1 Chore: necessary update to separate generic and api utils
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
other/materials_designer/uploads/0-Ni.json (1)

55-61: Fix inconsistency with other material data files.

The external.source value should be "materials project" (lowercase with space) to match the format used in other similar JSON files in the repository (other/experiments/uploads/Ni.json, other/experiments/uploads/Gr.json). The current change to "MaterialsProject" creates inconsistent data formatting.

examples/job/run-simulations-and-extract-properties.ipynb (1)

553-571: Logic error: Loop body appears to only execute for the last material.

The for-loop at line 554 only contains the job = next(...) statement. Lines 556-571 (starting with final_structure = ...) are outside the loop and will only execute once using the last job value. This means results will only contain data for the last material.

🔎 Proposed fix - indent the loop body
 results = []
 for material in materials:
     job = next((job for job in jobs if job["_material"]["_id"] == material["_id"]))
-final_structure = get_property_by_subworkflow_and_unit_indicies(property_endpoints, "final_structure", job, 0, 0)[
-    "data"
-]
-pressure = get_property_by_subworkflow_and_unit_indicies(property_endpoints, "pressure", job, 0, 0)["data"]["value"]
-unit_flowchart_id = job["workflow"]["subworkflows"][0]["units"][1]["flowchartId"]
-band_gap_direct = property_endpoints.get_direct_band_gap(job["_id"], unit_flowchart_id)
-band_gap_indirect = property_endpoints.get_indirect_band_gap(job["_id"], unit_flowchart_id)
-results.append(
-    {
-        "initial_structure": material,
-        "final_structure": final_structure,
-        "pressure": pressure,
-        "band_gap_direct": band_gap_direct,
-        "band_gap_indirect": band_gap_indirect,
-    }
-)
+    final_structure = get_property_by_subworkflow_and_unit_indicies(property_endpoints, "final_structure", job, 0, 0)[
+        "data"
+    ]
+    pressure = get_property_by_subworkflow_and_unit_indicies(property_endpoints, "pressure", job, 0, 0)["data"]["value"]
+    unit_flowchart_id = job["workflow"]["subworkflows"][0]["units"][1]["flowchartId"]
+    band_gap_direct = property_endpoints.get_direct_band_gap(job["_id"], unit_flowchart_id)
+    band_gap_indirect = property_endpoints.get_indirect_band_gap(job["_id"], unit_flowchart_id)
+    results.append(
+        {
+            "initial_structure": material,
+            "final_structure": final_structure,
+            "pressure": pressure,
+            "band_gap_direct": band_gap_direct,
+            "band_gap_indirect": band_gap_indirect,
+        }
+    )
examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb (1)

826-840: Missing import for display function.

The display(df) call on line 830 uses IPython's display function, but it's not imported in this cell or earlier in the notebook. This will raise a NameError at runtime.

🔎 Proposed fix - add display import

Add the import at the beginning of this cell:

+from IPython.display import display
 from matplotlib import pyplot as plt
 import pandas as pd
🧹 Nitpick comments (2)
utils/api.py (1)

36-36: Audit URL scheme before opening.

Static analysis flags urllib.request.urlopen (S310) because it can accept arbitrary schemes like file://. Since signed_url comes from the API, the risk is low, but consider validating the scheme if this function may be exposed to untrusted input.

🔎 Optional validation
+    from urllib.parse import urlparse
+    
+    parsed = urlparse(signed_url)
+    if parsed.scheme not in ("http", "https"):
+        raise ValueError(f"Unsupported URL scheme: {parsed.scheme}")
+    
     server_response = urllib.request.urlopen(signed_url)
utils/visualize.py (1)

63-63: Avoid shadowing the built-in id.

The variable id shadows Python's built-in function. Consider renaming to element_id or div_id.

🔎 Suggested fix
-        id = str(uuid.uuid4())
+        element_id = str(uuid.uuid4())

And update the references on lines 73 and 79:

-        display(HTML(f'<style>{css}</style><div id="{id}"></div>'))
+        display(HTML(f'<style>{css}</style><div id="{element_id}"></div>'))
-                f'document.getElementById("{id}").appendChild(renderjson({json_str}))</script>'
+                f'document.getElementById("{element_id}").appendChild(renderjson({json_str}))</script>'
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52d228f and 3293d7f.

📒 Files selected for processing (25)
  • examples/job/create_and_submit_job.ipynb
  • examples/job/get-file-from-job.ipynb
  • examples/job/ml-train-model-predict-properties.ipynb
  • examples/job/run-simulations-and-extract-properties.ipynb
  • examples/material/create_material.ipynb
  • examples/material/get_materials_by_formula.ipynb
  • examples/material/upload_materials_from_file_poscar.ipynb
  • examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb
  • examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb
  • examples/system/get_authentication_params.ipynb
  • examples/workflow/get_workflows.ipynb
  • examples/workflow/qe_scf_calculation.ipynb
  • other/experiments/create_interface_with_min_energy_by_miller_indices.ipynb
  • other/jarvis/run_job_using_material_from_jarvis_db.ipynb
  • other/materials_designer/uploads/0-Ni.json
  • other/materials_designer/workflows/band_gap.ipynb
  • other/materialsproject/api_interoperability_showcase.ipynb
  • other/materialsproject/api_interoperability_showcase.py
  • other/materialsproject/import_materials_from_materialsproject.ipynb
  • other/materialsproject/import_materials_from_materialsproject.py
  • other/webinar/generate-al2o3-slab-structures.py
  • other/webinar/wulff-construction-surface-energy-study-cu.py
  • utils/api.py
  • utils/generic.py
  • utils/visualize.py
💤 Files with no reviewable changes (1)
  • utils/generic.py
🧰 Additional context used
🧬 Code graph analysis (4)
other/webinar/generate-al2o3-slab-structures.py (2)
utils/api.py (1)
  • save_files (14-40)
utils/visualize.py (1)
  • display_JSON (44-83)
other/materialsproject/api_interoperability_showcase.py (1)
utils/visualize.py (1)
  • display_JSON (44-83)
other/webinar/wulff-construction-surface-energy-study-cu.py (1)
utils/api.py (1)
  • save_files (14-40)
other/materialsproject/import_materials_from_materialsproject.py (1)
utils/visualize.py (1)
  • display_JSON (44-83)
🪛 Ruff (0.14.10)
examples/system/get_authentication_params.ipynb

49-49: await statement outside of a function

(F704)

other/experiments/create_interface_with_min_energy_by_miller_indices.ipynb

315-315: Ambiguous variable name: l

(E741)

examples/job/run-simulations-and-extract-properties.ipynb

188-188: Found useless expression. Either assign it to a variable or remove it.

(B018)

utils/visualize.py

77-77: Use explicit conversion flag

Replace with conversion flag

(RUF010)

examples/job/ml-train-model-predict-properties.ipynb

191-191: Found useless expression. Either assign it to a variable or remove it.

(B018)

utils/api.py

36-36: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)

examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb

37-37: await statement outside of a function

(F704)


338-338: Undefined name display

(F821)

🔇 Additional comments (29)
other/materialsproject/import_materials_from_materialsproject.py (1)

56-56: LGTM!

The import path update from utils.generic to utils.visualize for display_JSON aligns with the broader utility refactoring in this PR. The function signature and usage remain compatible.

examples/job/create_and_submit_job.ipynb (1)

108-108: LGTM!

The import path change to utils.visualize is consistent with the refactoring across other notebooks in this PR.

other/materialsproject/api_interoperability_showcase.py (1)

58-58: LGTM!

The import path update to utils.visualize is consistent with the project-wide refactoring. Note that display_JSON appears to be imported but not used in this script version—this may be intentional if the corresponding notebook uses it interactively.

examples/workflow/qe_scf_calculation.ipynb (1)

144-151: LGTM!

The import of wait_for_jobs_to_finish from the new utils.api module is correct and aligns with the utility refactoring. The function usage at line 340 properly passes the required parameters.

utils/api.py (1)

1-11: Note: tabulate import is unused at module level but used in wait_for_jobs_to_finish.

The import is valid and used within the function. No action needed.

examples/material/upload_materials_from_file_poscar.ipynb (1)

121-124: LGTM!

The import path update to utils.visualize is consistent with the project-wide refactoring.

other/materialsproject/api_interoperability_showcase.ipynb (1)

124-131: LGTM!

The import path update to utils.visualize is consistent with the project-wide refactoring. Note that display_JSON appears to be imported but not used in this notebook—consider removing the unused import if it's not needed.

examples/workflow/get_workflows.ipynb (1)

123-123: LGTM - Import path updated correctly.

The import path for display_JSON has been updated from utils.generic to utils.visualize, aligning with the PR's refactoring objectives to reorganize utility functions.

other/materialsproject/import_materials_from_materialsproject.ipynb (1)

98-98: LGTM - Import path updated correctly.

The import path for display_JSON has been updated from utils.generic to utils.visualize, consistent with the module reorganization.

other/materials_designer/workflows/band_gap.ipynb (3)

443-443: LGTM - Import path updated correctly.

The import path for get_cluster_name has been updated from utils.generic to utils.api, properly categorizing this API-related utility.


487-487: LGTM - Import path updated correctly.

The import path for display_JSON has been updated from utils.generic to utils.visualize, consistent with the visualization utilities reorganization.


536-536: LGTM - Import path updated correctly.

The import path for wait_for_jobs_to_finish has been updated from utils.generic to utils.api, properly categorizing this API-related utility.

other/webinar/wulff-construction-surface-energy-study-cu.py (1)

27-27: LGTM - Import updated to new utility module.

The import has been updated to use save_files from utils.api, aligning with the refactoring that consolidates API-related utilities. The removal of ensure_packages_are_installed appears intentional per the PR objectives.

examples/material/create_material.ipynb (1)

118-118: LGTM - Import path updated correctly.

The import path for display_JSON has been updated from utils.generic to utils.visualize, consistent with the module reorganization.

examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb (3)

287-287: LGTM - Import path updated correctly.

The import path for display_JSON has been updated from utils.generic to utils.visualize, consistent with the visualization utilities reorganization.


398-399: LGTM - Import paths updated correctly.

Both import paths have been updated appropriately:

  • display_JSON moved to utils.visualize
  • copy_bank_workflow_by_system_name moved to utils.api

This properly separates visualization and API-related utilities.


749-749: LGTM - Import path updated correctly.

The import path for wait_for_jobs_to_finish has been updated from utils.generic to utils.api, properly categorizing this API-related utility.

other/jarvis/run_job_using_material_from_jarvis_db.ipynb (1)

250-251: LGTM - Import paths updated correctly.

Both import statements have been updated appropriately:

  • API utilities (wait_for_jobs_to_finish, get_property_by_subworkflow_and_unit_indicies) moved to utils.api
  • Visualization utilities (dataframe_to_html, display_JSON) moved to utils.visualize

This properly separates concerns between API and visualization functionality.

other/webinar/generate-al2o3-slab-structures.py (1)

35-36: LGTM - Imports updated to new utility modules.

Both import paths have been updated appropriately:

  • save_files moved to utils.api for API-related file operations
  • display_JSON moved to utils.visualize for visualization utilities

This aligns with the refactoring that consolidates related utilities into dedicated modules. The removal of ensure_packages_are_installed appears intentional per the PR objectives.

examples/system/get_authentication_params.ipynb (1)

134-137: LGTM - Import relocation is consistent with PR objectives.

The import of display_JSON from utils.visualize aligns with the broader refactoring that consolidates visualization utilities into the new module.

examples/material/get_materials_by_formula.ipynb (1)

117-122: LGTM - Import update follows the new module structure.

The display_JSON import correctly points to the new utils.visualize module location.

utils/visualize.py (2)

28-41: LGTM - Clean DataFrame to HTML conversion utility.

The implementation correctly applies table styles and returns a Styler object for proper rendering in Jupyter environments.


44-84: LGTM - Well-structured JSON display with dual-mode support.

The function properly handles both interactive (web-based) and fallback (print) modes. The docstring clearly explains the parameters and behavior.

examples/job/get-file-from-job.ipynb (1)

142-157: LGTM - Import reorganization follows the new module structure.

The API utilities (wait_for_jobs_to_finish, get_property_by_subworkflow_and_unit_indicies) are correctly imported from utils.api, while visualization utilities (dataframe_to_html, display_JSON) are imported from utils.visualize.

examples/job/run-simulations-and-extract-properties.ipynb (1)

153-166: LGTM - Import updates consistent with module reorganization.

The imports from utils.api and utils.visualize correctly reflect the new module structure.

examples/job/ml-train-model-predict-properties.ipynb (1)

143-164: LGTM - Import reorganization follows the new module structure.

The API utilities are correctly imported from utils.api and visualization utilities from utils.visualize. The structure is clean and consistent with the other notebooks in this PR.

examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb (3)

287-301: LGTM - Import relocation consistent with PR objectives.

The display_JSON import from utils.visualize follows the established pattern.


394-403: LGTM - API utility import follows new module structure.

The copy_bank_workflow_by_system_name import from utils.api is correctly placed.


762-767: LGTM - API utility imports correctly placed.

The wait_for_jobs_to_finish and get_property_by_subworkflow_and_unit_indicies are properly imported from utils.api.

Comment on lines 593 to 599
"source": [
"table = []\n",
"for result in results:\n",
" data = flatten_material(result[\"initial_structure\"])\n",
"data.extend(flatten_material(result[\"initial_structure\"]))\n",
"data.extend([result[\"pressure\"], result[\"band_gap_direct\"], result[\"band_gap_indirect\"]])\n",
"table.append(data)"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Logic error: Loop body appears to only execute for the last result.

Similar to the previous issue, the for-loop only contains data = flatten_material(...). Lines 597-599 are outside the loop and will only execute once.

🔎 Proposed fix - indent the loop body
 table = []
 for result in results:
     data = flatten_material(result["initial_structure"])
-data.extend(flatten_material(result["initial_structure"]))
-data.extend([result["pressure"], result["band_gap_direct"], result["band_gap_indirect"]])
-table.append(data)
+    data.extend(flatten_material(result["initial_structure"]))
+    data.extend([result["pressure"], result["band_gap_direct"], result["band_gap_indirect"]])
+    table.append(data)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"source": [
"table = []\n",
"for result in results:\n",
" data = flatten_material(result[\"initial_structure\"])\n",
"data.extend(flatten_material(result[\"initial_structure\"]))\n",
"data.extend([result[\"pressure\"], result[\"band_gap_direct\"], result[\"band_gap_indirect\"]])\n",
"table.append(data)"
"source": [
"table = []\n",
"for result in results:\n",
" data = flatten_material(result[\"initial_structure\"])\n",
" data.extend(flatten_material(result[\"initial_structure\"]))\n",
" data.extend([result[\"pressure\"], result[\"band_gap_direct\"], result[\"band_gap_indirect\"]])\n",
" table.append(data)"
🤖 Prompt for AI Agents
In examples/job/run-simulations-and-extract-properties.ipynb around lines
593-599 the for-loop only contains the single statement `data =
flatten_material(result["initial_structure"])` so the subsequent lines that
extend `data` and append it to `table` run only once after the loop; fix by
indenting the lines that extend `data` with pressure and band gap values and the
`table.append(data)` so they are inside the for-loop (i.e., ensure
data.extend(...) and table.append(data) execute for each `result`), preserving
the current variable names and order of operations.

Comment on lines +316 to +329
"outputs": [
{
"ename": "ModuleNotFoundError",
"evalue": "No module named 'matgl'",
"output_type": "error",
"traceback": [
"\u001B[0;31m---------------------------------------------------------------------------\u001B[0m",
"\u001B[0;31mModuleNotFoundError\u001B[0m Traceback (most recent call last)",
"Cell \u001B[0;32mIn[2], line 11\u001B[0m\n\u001B[1;32m 9\u001B[0m \u001B[38;5;28;01mfrom\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;21;01mase\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01moptimize\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;28;01mimport\u001B[39;00m BFGS\n\u001B[1;32m 10\u001B[0m \u001B[38;5;28;01mfrom\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;21;01mase\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mcalculators\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01memt\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;28;01mimport\u001B[39;00m EMT\n\u001B[0;32m---> 11\u001B[0m \u001B[38;5;28;01mimport\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;21;01mmatgl\u001B[39;00m\n\u001B[1;32m 12\u001B[0m \u001B[38;5;28;01mfrom\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;21;01mmatgl\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mext\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mase\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;28;01mimport\u001B[39;00m M3GNetCalculator\n\u001B[1;32m 13\u001B[0m \u001B[38;5;28;01mfrom\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;21;01malignn\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mff\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mff\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;28;01mimport\u001B[39;00m AlignnAtomwiseCalculator, default_path, revised_path, wt1_path, wt01_path, \\\n\u001B[1;32m 14\u001B[0m wt10_path\n",
"\u001B[0;31mModuleNotFoundError\u001B[0m: No module named 'matgl'"
]
}
],
"execution_count": 2
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Missing dependency prevents notebook execution.

The notebook execution captured a ModuleNotFoundError for the matgl module (line 108), which is required for the M3GNet calculator functionality. This prevents the notebook from running.

Consider one of the following solutions:

  1. Add matgl to dependencies if M3GNet support is required
  2. Make the import optional to gracefully degrade when matgl is unavailable
  3. Remove M3GNET from CALCULATOR_MAP if this calculator isn't needed for this experiment
🔎 Proposed fix to make matgl import optional
 from ase.calculators.emt import EMT
-import matgl
-from matgl.ext.ase import M3GNetCalculator
 from alignn.ff.ff import AlignnAtomwiseCalculator, default_path, revised_path, wt1_path, wt01_path, \
     wt10_path
+
+# Optional matgl import
+try:
+    import matgl
+    from matgl.ext.ase import M3GNetCalculator
+    pot = matgl.load_model("M3GNet-MP-2021.2.8-PES")
+    m3gnet_calculator = M3GNetCalculator(pot)
+    MATGL_AVAILABLE = True
+except ImportError:
+    m3gnet_calculator = None
+    MATGL_AVAILABLE = False
 
 # Type hinting
 from ase import Atoms
@@ -120,8 +129,6 @@
 # Set up the calculators
 model_path = wt1_path()
 alignn_calculator = AlignnAtomwiseCalculator(path=model_path)
-pot = matgl.load_model("M3GNet-MP-2021.2.8-PES")
-m3gnet_calculator = M3GNetCalculator(pot)
 
-OPTIMIZER_MAP = {"BFGS": BFGS}
-CALCULATOR_MAP = {"EMT": EMT(), "ALIGNN": alignn_calculator, "M3GNET": m3gnet_calculator}
+CALCULATOR_MAP = {"EMT": EMT(), "ALIGNN": alignn_calculator}
+if MATGL_AVAILABLE:
+    CALCULATOR_MAP["M3GNET"] = m3gnet_calculator

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In other/experiments/create_interface_with_min_energy_by_miller_indices.ipynb
around lines 316-329, the notebook fails with ModuleNotFoundError for the matgl
import (M3GNet calculator); make the import optional or ensure dependency is
installed: either add matgl to the project's dependencies
(requirements/dev-requirements) if M3GNet support is required, or wrap the matgl
import in a try/except that sets a flag (e.g., HAS_MATGL=False) and
conditionally register or skip the M3GNet entry in CALCULATOR_MAP so the
notebook can run without matgl.

"outputs": [],
"source": [
"from utils.generic import wait_for_jobs_to_finish, get_property_by_subworkflow_and_unit_indicies\n",
"from utils.api import wait_for_jobs_to_finish, get_property_by_subworkflow_and_unit_indicies\n",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove duplicate import statement.

This line duplicates the imports already declared at lines 250-251. The functions wait_for_jobs_to_finish and get_property_by_subworkflow_and_unit_indicies are already imported from utils.api earlier in the notebook.

🔎 Suggested fix
-from utils.api import wait_for_jobs_to_finish, get_property_by_subworkflow_and_unit_indicies
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"from utils.api import wait_for_jobs_to_finish, get_property_by_subworkflow_and_unit_indicies\n",
🤖 Prompt for AI Agents
In other/jarvis/run_job_using_material_from_jarvis_db.ipynb around line 442,
there is a duplicate import "from utils.api import wait_for_jobs_to_finish,
get_property_by_subworkflow_and_unit_indicies" already declared at lines
~250-251; remove this redundant import line (delete the line at 442) and verify
there are no other duplicated imports of these functions elsewhere in the
notebook.

Comment thread utils/api.py
Comment on lines +27 to +33
files = job_endpoint.list_files(job_id)
for file in files:
if file["name"] == filename_on_cloud:
file_metadata = file

# Get a download URL for the CONTCAR
signed_url = file_metadata["signedUrl"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential NameError if file not found.

If filename_on_cloud doesn't match any file in the list, file_metadata is never assigned, causing NameError on line 33. Consider raising an informative error or using a safer pattern.

🔎 Proposed fix
     files = job_endpoint.list_files(job_id)
+    file_metadata = None
     for file in files:
         if file["name"] == filename_on_cloud:
             file_metadata = file
+            break
 
-    # Get a download URL for the CONTCAR
+    if file_metadata is None:
+        raise FileNotFoundError(f"File '{filename_on_cloud}' not found in job {job_id}")
+
     signed_url = file_metadata["signedUrl"]
🤖 Prompt for AI Agents
In utils/api.py around lines 27 to 33, the loop may never assign file_metadata
if filename_on_cloud isn't found, leading to a NameError on accessing
file_metadata["signedUrl"]; update the code to search safely (e.g., use next((f
for f in files if f["name"] == filename_on_cloud), None)) and then check the
result, raising a clear exception (or returning a controlled error) if None
before trying to access signedUrl so the function fails with an informative
message instead of NameError.

Comment thread utils/api.py
Comment on lines +87 to +100
def copy_bank_workflow_by_system_name(endpoint: BankWorkflowEndpoints, system_name: str, account_id: str) -> dict:
"""
Copies a bank workflow with given ID into the account's workflows.

Args:
endpoint (endpoints.bank_workflows.BankWorkflowEndpoints): an instance of BankWorkflowEndpoints class
system_name (str): workflow system name.
account_id (str): ID of account to copy the bank workflow into.

Returns:
dict: new account's workflow
"""
bank_workflow_id = endpoint.list({"systemName": system_name})[0]["_id"]
return endpoint.copy(bank_workflow_id, account_id)["_id"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential IndexError if no workflow matches the system name.

If endpoint.list(...) returns an empty list, accessing [0] raises IndexError. Consider adding a guard.

🔎 Proposed fix
 def copy_bank_workflow_by_system_name(endpoint: BankWorkflowEndpoints, system_name: str, account_id: str) -> dict:
     ...
-    bank_workflow_id = endpoint.list({"systemName": system_name})[0]["_id"]
+    workflows = endpoint.list({"systemName": system_name})
+    if not workflows:
+        raise ValueError(f"No bank workflow found with systemName: {system_name}")
+    bank_workflow_id = workflows[0]["_id"]
     return endpoint.copy(bank_workflow_id, account_id)["_id"]
🤖 Prompt for AI Agents
In utils/api.py around lines 87 to 100, the code assumes
endpoint.list({"systemName": system_name}) always returns at least one item
which can raise IndexError; guard by capturing the list result into a variable,
check if it's empty and raise a clear ValueError (or custom exception) including
the system_name (and account_id if helpful) before attempting to access [0];
then proceed to call endpoint.copy(...) and return the expected value. Also
update the docstring return description to match what the function actually
returns (ID string vs dict).

Comment thread utils/api.py
Comment on lines +103 to +105
def get_property_by_subworkflow_and_unit_indicies(
endpoint: PropertiesEndpoints, property_name: str, job: dict, subworkflow_index: int, unit_index: int
) -> dict:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Typo in function name: "indicies" should be "indices".

The function is named get_property_by_subworkflow_and_unit_indicies but the correct spelling is "indices". This affects the public API surface.

🔎 Proposed fix
-def get_property_by_subworkflow_and_unit_indicies(
+def get_property_by_subworkflow_and_unit_indices(
     endpoint: PropertiesEndpoints, property_name: str, job: dict, subworkflow_index: int, unit_index: int
 ) -> dict:

Note: If this function is already in use elsewhere with the misspelled name, you may need to update all call sites or provide an alias for backward compatibility.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def get_property_by_subworkflow_and_unit_indicies(
endpoint: PropertiesEndpoints, property_name: str, job: dict, subworkflow_index: int, unit_index: int
) -> dict:
def get_property_by_subworkflow_and_unit_indices(
endpoint: PropertiesEndpoints, property_name: str, job: dict, subworkflow_index: int, unit_index: int
) -> dict:
🤖 Prompt for AI Agents
In utils/api.py around lines 103 to 105, the function name
get_property_by_subworkflow_and_unit_indicies contains a typo ("indicies" ->
"indices"); rename the function to get_property_by_subworkflow_and_unit_indices
and update all internal call sites and imports to use the corrected name, or if
backward compatibility is required, keep the old name as a thin wrapper that
calls the new function and emits a deprecation warning so external consumers are
not broken.

Comment thread utils/api.py
Comment on lines +123 to +125
def get_cluster_name(name: str = "cluster-001") -> str:
clusters = json.loads(os.environ.get("CLUSTERS", "[]") or "[]")
return clusters[0] if clusters else name
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Return type mismatch: function returns cluster object, not cluster name.

The function is named get_cluster_name and documented to return a name string, but clusters[0] returns the entire cluster object (a dict), not just the name.

🔎 Proposed fix if cluster objects have a name field
 def get_cluster_name(name: str = "cluster-001") -> str:
     clusters = json.loads(os.environ.get("CLUSTERS", "[]") or "[]")
-    return clusters[0] if clusters else name
+    return clusters[0].get("displayName", name) if clusters else name

Alternatively, verify what structure CLUSTERS contains and adjust accordingly.

🤖 Prompt for AI Agents
In utils/api.py around lines 123 to 125, get_cluster_name currently returns
clusters[0] which is the entire cluster object (dict) instead of a name string;
change the return to extract and return the name string (e.g. if clusters[0] is
a dict return clusters[0].get("name", name)), and fall back to the provided
default name when CLUSTERS is empty or when the object has no name field; ensure
the function always returns a string.

@VsevolodX VsevolodX merged commit c158bfe into main Dec 27, 2025
5 checks passed
@VsevolodX VsevolodX deleted the feature/SOF-7782 branch December 27, 2025 08:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants