
Feature/SOF-7782-1 Chore: necessary update to separate generic and api utils #262

Merged
VsevolodX merged 3 commits into feature/SOF-7782 from feature/SOF-7782-1 on Dec 27, 2025

Conversation

@VsevolodX
Member

@VsevolodX VsevolodX commented Dec 27, 2025

Separate API-related functions (exabyte-api-client) and visualization-related functions (IPython/ipywidgets) from the truly generic ones that use only built-in libraries, to allow the notebooks to work independently.

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



@coderabbitai

coderabbitai Bot commented Dec 27, 2025

📝 Walkthrough

This PR refactors utility modules by relocating functions from utils/generic.py to specialized modules: utils/api.py for API-related operations and utils/visualize.py for visualization tasks. Example notebooks and scripts are updated to import from the new module locations. A new thermal disorder notebook and documentation updates are included.

Changes

Cohort / File(s) Summary
Core Utility Modules
utils/api.py, utils/generic.py, utils/visualize.py
Created new utils/api.py with six API helper functions (save_files, get_jobs_statuses_by_ids, wait_for_jobs_to_finish, copy_bank_workflow_by_system_name, get_property_by_subworkflow_and_unit_indicies, get_cluster_name). Expanded utils/visualize.py with dataframe_to_html and display_JSON. Reduced utils/generic.py, retaining only dict_to_namespace.
Job Examples
examples/job/create_and_submit_job.ipynb, examples/job/get-file-from-job.ipynb, examples/job/ml-train-model-predict-properties.ipynb, examples/job/run-simulations-and-extract-properties.ipynb
Updated imports: utils.generic → utils.api for API functions; utils.generic → utils.visualize for visualization. Normalized cell metadata (execution_count, outputs arrays).
Material Examples
examples/material/create_material.ipynb, examples/material/get_materials_by_formula.ipynb, examples/material/upload_materials_from_file_poscar.ipynb
Redirected display_JSON imports from utils.generic to utils.visualize. Standardized notebook cell metadata structures.
Publication Examples
examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb, examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb
Reorganized imports to use utils.api for API utilities and utils.visualize for display functions. Aligned cell metadata conventions.
Workflow & System Examples
examples/workflow/get_workflows.ipynb, examples/workflow/qe_scf_calculation.ipynb, examples/system/get_authentication_params.ipynb
Updated import paths for utilities. Normalized execution metadata across cells. Enhanced initialization flows and post-processing steps (e.g., energy extraction, inline plotting).
Other Examples & Scripts
other/experiments/create_interface_with_min_energy_by_miller_indices.ipynb, other/jarvis/run_job_using_material_from_jarvis_db.ipynb, other/materials_designer/workflows/band_gap.ipynb, other/materialsproject/api_interoperability_showcase.ipynb, other/materialsproject/api_interoperability_showcase.py, other/materialsproject/import_materials_from_materialsproject.ipynb, other/materialsproject/import_materials_from_materialsproject.py, other/webinar/generate-al2o3-slab-structures.py, other/webinar/wulff-construction-surface-energy-study-cu.py
Unified import refactoring to use utils.api for API operations and utils.visualize for visualization. Removed obsolete ensure_packages_are_installed calls. Added execution metadata to tracked cells.
Materials Designer
other/materials_designer/Introduction.ipynb
Added documentation subsection link for "5.2. Thermal Disorder" with Maxwell-Boltzmann thermal disorder notebook reference.
New Materials Designer Notebook
other/materials_designer/create_maxwell_disorder.ipynb
New notebook implementing Maxwell-Boltzmann thermal disorder creation: environment setup, material retrieval, optional supercell visualization, disorder parameter application, result export.
Materials Designer Data
other/materials_designer/uploads/0-Ni.json
Updated metadata field: external.source from "Materials Project" to "MaterialsProject".
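
The walkthrough above notes that utils/generic.py now retains only dict_to_namespace. For context, a minimal sketch of what such a helper might look like, built on the stdlib only (the actual implementation in the repository may differ):

```python
# Hypothetical sketch of a dict_to_namespace helper: recursively converts
# nested dicts into attribute-accessible namespaces using only the stdlib.
from types import SimpleNamespace


def dict_to_namespace(d):
    """Recursively convert a dict (and any nested dicts) to SimpleNamespace."""
    if isinstance(d, dict):
        return SimpleNamespace(**{k: dict_to_namespace(v) for k, v in d.items()})
    return d


config = dict_to_namespace({"job": {"status": "finished", "id": "abc123"}})
print(config.job.status)  # -> finished
```

This shape explains why the function belongs in utils/generic.py: unlike the API and visualization helpers, it has no third-party dependencies.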

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • timurbazhirov

Poem

🐰 Hoppy refactoring, so neat and so clean,
From generic to homes where they're meant to be seen—
visualize sparkles, api takes flight,
Our imports now organized, oh what a sight!
Like organizing carrots by color and size, 🥕
The code hops along, a utility reprise!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped - CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately describes the main change: separating generic and api utilities into distinct modules (utils.generic, utils.api, utils.visualize) across the codebase.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 88.89%, which is sufficient; the required threshold is 80.00%.


@VsevolodX VsevolodX changed the base branch from main to feature/SOF-7782 December 27, 2025 03:07

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (4)
utils/api.py (2)

58-84: Consider adding a timeout mechanism.

The function polls indefinitely until all jobs finish. Consider adding an optional timeout parameter to prevent infinite loops if jobs become stuck or encounter unexpected states.

🔎 Suggested enhancement
-def wait_for_jobs_to_finish(endpoint: JobEndpoints, job_ids: list, poll_interval: int = 10) -> None:
+def wait_for_jobs_to_finish(endpoint: JobEndpoints, job_ids: list, poll_interval: int = 10, timeout: int = None) -> None:
     """
     Waits for jobs to finish and prints their statuses.
     A job is considered finished if it is not in "pre-submission", "submitted", or, "active" status.

     Args:
         endpoint (JobEndpoints): Job endpoint object from the Exabyte API Client
         job_ids (list): list of job IDs to wait for
         poll_interval (int): poll interval for job information in seconds. Defaults to 10.
+        timeout (int): maximum time to wait in seconds. None for no timeout.
     """
     print("Wait for jobs to finish, poll interval: {0} sec".format(poll_interval))
+    start_time = time.time()
     while True:
         statuses = get_jobs_statuses_by_ids(endpoint, job_ids)

         errored_jobs = len([status for status in statuses if status == "error"])
         active_jobs = len([status for status in statuses if status == "active"])
         finished_jobs = len([status for status in statuses if status == "finished"])
         submitted_jobs = len([status for status in statuses if status == "submitted"])

         headers = ["TIME", "SUBMITTED-JOBS", "ACTIVE-JOBS", "FINISHED-JOBS", "ERRORED-JOBS"]
         now = datetime.datetime.now().strftime("%Y-%m-%d-%H:%M:%S")
         row = [now, submitted_jobs, active_jobs, finished_jobs, errored_jobs]
         print(tabulate([row], headers, tablefmt="grid", stralign="center"))

         if all([status not in ["pre-submission", "submitted", "active"] for status in statuses]):
             break
+        
+        if timeout and (time.time() - start_time) > timeout:
+            raise TimeoutError(f"Jobs did not finish within {timeout} seconds")
+        
         time.sleep(poll_interval)

123-125: Consider adding error handling for invalid JSON.

If the CLUSTERS environment variable contains malformed JSON, json.loads will raise a JSONDecodeError. While this may be acceptable if the environment is expected to be well-formed, consider adding error handling for robustness.

🔎 Optional enhancement
 def get_cluster_name(name: str = "cluster-001") -> str:
-    clusters = json.loads(os.environ.get("CLUSTERS", "[]") or "[]")
+    try:
+        clusters = json.loads(os.environ.get("CLUSTERS", "[]") or "[]")
+    except json.JSONDecodeError:
+        clusters = []
     return clusters[0] if clusters else name
other/jarvis/run_job_using_material_from_jarvis_db.ipynb (1)

442-442: Remove duplicate import statement.

These functions (wait_for_jobs_to_finish and get_property_by_subworkflow_and_unit_indicies) are already imported at lines 250-251. This duplicate import is unnecessary and should be removed.

🔎 Proposed fix
-from utils.api import wait_for_jobs_to_finish, get_property_by_subworkflow_and_unit_indicies
 job_id = job["_id"]
 wait_for_jobs_to_finish(job_endpoints, [job_id])
utils/visualize.py (1)

77-77: Optional: Use explicit conversion flag in f-string.

The static analysis tool suggests using an explicit conversion flag for better clarity.

🔎 Proposed fix
                 f"<script>{js} "
                 f"renderjson.set_show_to_level({str(level)}); "
-                f'renderjson.set_icons("▸","▾"); '
+                f'renderjson.set_icons("▸","▾"); '
                 f'document.getElementById("{id}").appendChild(renderjson({json_str}))</script>'

Alternatively, you can use !s conversion flag:

-                f"renderjson.set_show_to_level({str(level)}); "
+                f"renderjson.set_show_to_level({level!s}); "
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2dc2025 and a0bdbd9.

📒 Files selected for processing (27)
  • examples/job/create_and_submit_job.ipynb
  • examples/job/get-file-from-job.ipynb
  • examples/job/ml-train-model-predict-properties.ipynb
  • examples/job/run-simulations-and-extract-properties.ipynb
  • examples/material/create_material.ipynb
  • examples/material/get_materials_by_formula.ipynb
  • examples/material/upload_materials_from_file_poscar.ipynb
  • examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb
  • examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb
  • examples/system/get_authentication_params.ipynb
  • examples/workflow/get_workflows.ipynb
  • examples/workflow/qe_scf_calculation.ipynb
  • other/experiments/create_interface_with_min_energy_by_miller_indices.ipynb
  • other/jarvis/run_job_using_material_from_jarvis_db.ipynb
  • other/materials_designer/Introduction.ipynb
  • other/materials_designer/create_maxwell_disorder.ipynb
  • other/materials_designer/uploads/0-Ni.json
  • other/materials_designer/workflows/band_gap.ipynb
  • other/materialsproject/api_interoperability_showcase.ipynb
  • other/materialsproject/api_interoperability_showcase.py
  • other/materialsproject/import_materials_from_materialsproject.ipynb
  • other/materialsproject/import_materials_from_materialsproject.py
  • other/webinar/generate-al2o3-slab-structures.py
  • other/webinar/wulff-construction-surface-energy-study-cu.py
  • utils/api.py
  • utils/generic.py
  • utils/visualize.py
💤 Files with no reviewable changes (1)
  • utils/generic.py
🧰 Additional context used
🧬 Code graph analysis (4)
other/materialsproject/api_interoperability_showcase.py (1)
utils/visualize.py (1)
  • display_JSON (44-83)
other/webinar/wulff-construction-surface-energy-study-cu.py (1)
utils/api.py (1)
  • save_files (14-40)
other/webinar/generate-al2o3-slab-structures.py (2)
utils/api.py (1)
  • save_files (14-40)
utils/visualize.py (1)
  • display_JSON (44-83)
other/materialsproject/import_materials_from_materialsproject.py (1)
utils/visualize.py (1)
  • display_JSON (44-83)
🪛 Ruff (0.14.10)
examples/job/ml-train-model-predict-properties.ipynb

191-191: Found useless expression. Either assign it to a variable or remove it.

(B018)

examples/system/get_authentication_params.ipynb

49-49: await statement outside of a function

(F704)

examples/job/run-simulations-and-extract-properties.ipynb

188-188: Found useless expression. Either assign it to a variable or remove it.

(B018)

other/experiments/create_interface_with_min_energy_by_miller_indices.ipynb

315-315: Ambiguous variable name: l

(E741)

other/materials_designer/create_maxwell_disorder.ipynb

23-23: await statement outside of a function

(F704)


24-24: await statement outside of a function

(F704)


27-27: await statement outside of a function

(F704)

examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb

37-37: await statement outside of a function

(F704)


338-338: Undefined name display

(F821)

utils/visualize.py

77-77: Use explicit conversion flag

Replace with conversion flag

(RUF010)

utils/api.py

36-36: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)

🔇 Additional comments (31)
other/materialsproject/import_materials_from_materialsproject.py (1)

56-56: LGTM! Import path correctly updated.

The import of display_JSON has been properly migrated from utils.generic to utils.visualize, aligning with the PR's refactoring objectives.

examples/job/ml-train-model-predict-properties.ipynb (1)

143-151: LGTM! Imports properly reorganized.

The imports have been correctly split and migrated:

  • API utilities (copy_bank_workflow_by_system_name, wait_for_jobs_to_finish, get_property_by_subworkflow_and_unit_indicies) → utils.api
  • Visualization utilities (dataframe_to_html, display_JSON) → utils.visualize

This logical grouping improves code organization and aligns with the PR's refactoring objectives.

utils/api.py (1)

43-55: LGTM!

The function correctly queries job statuses using the API endpoint and returns the expected list of status strings.

other/webinar/wulff-construction-surface-energy-study-cu.py (1)

27-27: LGTM! Import updated correctly.

The import has been properly migrated from the bootstrap mechanism to a direct import from utils.api, simplifying the dependency setup.

other/materialsproject/api_interoperability_showcase.py (1)

58-58: LGTM! Import path correctly updated.

The import of display_JSON has been properly migrated from utils.generic to utils.visualize, consistent with the refactoring objectives.

other/materialsproject/api_interoperability_showcase.ipynb (1)

125-125: LGTM! Import path correctly updated.

The import of display_JSON has been properly migrated from utils.generic to utils.visualize, consistent with the refactoring objectives and matching the corresponding Python file.

other/materials_designer/uploads/0-Ni.json (1)

57-57: This change creates an inconsistency rather than a standardization.

The source value in 0-Ni.json uses "MaterialsProject", but other material JSON files in the codebase (e.g., other/experiments/uploads/Gr.json and other/experiments/uploads/Ni.json) use "materials project" (lowercase with space). Either align this value with the existing convention in the experiments folder, or ensure all source references across the entire codebase use a unified naming pattern.

Likely an incorrect or invalid review comment.

examples/material/upload_materials_from_file_poscar.ipynb (1)

122-122: LGTM! Import path updated correctly.

The import has been successfully updated to use the new utils.visualize module, aligning with the PR's objective to separate visualization utilities from generic utilities.

examples/job/create_and_submit_job.ipynb (1)

108-108: LGTM! Import path updated correctly.

Consistent with the refactoring effort to relocate display_JSON to the specialized utils.visualize module.

other/materials_designer/Introduction.ipynb (1)

85-87: LGTM! Documentation updated for new thermal disorder feature.

The new subsection correctly documents the Maxwell-Boltzmann thermal disorder notebook and follows the existing documentation structure.

examples/workflow/get_workflows.ipynb (1)

123-123: LGTM! Import path updated correctly.

Consistent with the repository-wide refactoring to use utils.visualize.display_JSON.

other/materialsproject/import_materials_from_materialsproject.ipynb (1)

98-98: LGTM! Import path updated correctly.

The import has been appropriately updated to align with the new module structure.

other/jarvis/run_job_using_material_from_jarvis_db.ipynb (1)

250-251: LGTM! Import paths updated correctly.

The imports have been successfully refactored to use the new utils.api and utils.visualize modules.

other/webinar/generate-al2o3-slab-structures.py (1)

35-36: Verify that package availability checks are handled.

The imports have been correctly updated to use utils.api and utils.visualize. However, the AI summary indicates that ensure_packages_are_installed() was removed. Please verify that any necessary package availability checks are still performed, or confirm that this check is no longer needed for this script.

examples/reproducing_publications/band_structure_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb (4)

287-287: LGTM! Import path updated correctly.

The import has been appropriately updated to use utils.visualize.display_JSON.


398-399: LGTM! Import paths updated correctly.

Both imports have been successfully refactored to use the new specialized modules (utils.visualize for display_JSON and utils.api for copy_bank_workflow_by_system_name).


699-701: LGTM! Minor formatting improvement.

The markdown header formatting has been updated to use list syntax, which is a minor stylistic improvement.


749-749: LGTM! Import path updated correctly.

The import has been appropriately updated to use utils.api.wait_for_jobs_to_finish.

examples/material/create_material.ipynb (1)

118-118: LGTM! Import path updated correctly.

The import of display_JSON has been correctly updated from utils.generic to utils.visualize, aligning with the module reorganization in this PR.

other/materials_designer/workflows/band_gap.ipynb (1)

443-443: LGTM! Import paths updated correctly.

The imports have been appropriately refactored to reflect the new module structure:

  • API-related functions (get_cluster_name, wait_for_jobs_to_finish) now imported from utils.api
  • Visualization function (display_JSON) now imported from utils.visualize

This separation of concerns improves code organization.

Also applies to: 487-487, 536-536

examples/job/get-file-from-job.ipynb (1)

142-149: LGTM! Import paths updated correctly.

The imports have been appropriately reorganized:

  • API utilities (wait_for_jobs_to_finish, get_property_by_subworkflow_and_unit_indicies) from utils.api
  • Visualization utilities (dataframe_to_html, display_JSON) from utils.visualize

This aligns with the PR's module restructuring objective.

other/materials_designer/create_maxwell_disorder.ipynb (1)

1-236: LGTM! Well-structured new notebook.

This notebook provides a clear workflow for applying Maxwell-Boltzmann thermal disorder to materials with:

  • Well-documented parameters and usage instructions
  • Proper environment setup for Pyodide
  • Optional supercell creation before disorder application
  • Visualization of results
  • Data export to outer runtime

Note: The static analysis hints about await statements outside of a function are false positives—these are valid in Jupyter notebooks running in Pyodide/JupyterLite environments.

examples/material/get_materials_by_formula.ipynb (1)

119-119: LGTM! Import path updated correctly.

The import of display_JSON has been correctly updated from utils.generic to utils.visualize, consistent with the module reorganization.

examples/system/get_authentication_params.ipynb (1)

134-134: LGTM! Import path updated correctly.

The import of display_JSON has been correctly updated from utils.generic to utils.visualize. The static analysis hint about await outside of a function at line 49 is a false positive—this is valid in Jupyter notebooks running in Pyodide environments.

examples/job/run-simulations-and-extract-properties.ipynb (1)

153-154: LGTM! Import paths updated correctly.

The imports have been appropriately reorganized:

  • API utilities (wait_for_jobs_to_finish, get_property_by_subworkflow_and_unit_indicies) from utils.api
  • Visualization utility (dataframe_to_html) from utils.visualize

This aligns with the PR's module restructuring to separate API and visualization concerns.

utils/visualize.py (2)

28-41: LGTM! Clean implementation.

The dataframe_to_html function provides a straightforward way to style Pandas DataFrames with configurable text alignment. The implementation is clear and follows pandas styling conventions.


44-83: The web assets renderjson.css and renderjson.js are present in utils/web/ and properly included in the repository.

examples/workflow/qe_scf_calculation.ipynb (1)

146-146: LGTM! The import path refactoring from utils.generic to utils.api is correct. The function wait_for_jobs_to_finish exists in the new module location with a signature that matches the usage at line 340: wait_for_jobs_to_finish(job_endpoints, [JOB_RESP["_id"]], 30).

examples/reproducing_publications/band_gaps_for_interface_bilayer_twisted_molybdenum_disulfide.ipynb (3)

288-288: The line numbers referenced in this review are incorrect. Line 288 contains plt.xlim(-2, 62), not an import statement. The actual imports of display_JSON and copy_bank_workflow_by_system_name appear at lines 107, 131-132 (in different cells of the Jupyter notebook). Both functions are correctly imported from their respective modules (utils.visualize and utils.api) and are properly used in the notebook.

Likely an incorrect or invalid review comment.


762-762: API utility imports are correctly organized.

Both functions are properly imported from utils.api:

  • wait_for_jobs_to_finish (defined in utils/api.py at line 58)
  • get_property_by_subworkflow_and_unit_indicies (defined in utils/api.py at line 103)

Both functions are used in the notebook and the imports are in the correct location.


114-114: The import statement location reference is incorrect. Line 114 in the notebook does not contain from utils.visualize import visualize_materials. The import statement does exist in the notebook and is correctly formatted, and both visualize_materials (line 267 in utils/visualize.py) and display_JSON (line 44 in utils/visualize.py) functions are properly defined in the utils/visualize.py module. However, the specific line number cited in the review comment does not match the actual notebook content.

Likely an incorrect or invalid review comment.

Comment on lines +309 to +329
"collapsed": false,
"ExecuteTime": {
"end_time": "2025-12-27T00:25:07.133081Z",
"start_time": "2025-12-27T00:25:05.723873Z"
}
},
"id": "2b17b3465167e5e1",
"execution_count": null
"outputs": [
{
"ename": "ModuleNotFoundError",
"evalue": "No module named 'matgl'",
"output_type": "error",
"traceback": [
"\u001B[0;31m---------------------------------------------------------------------------\u001B[0m",
"\u001B[0;31mModuleNotFoundError\u001B[0m Traceback (most recent call last)",
"Cell \u001B[0;32mIn[2], line 11\u001B[0m\n\u001B[1;32m 9\u001B[0m \u001B[38;5;28;01mfrom\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;21;01mase\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01moptimize\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;28;01mimport\u001B[39;00m BFGS\n\u001B[1;32m 10\u001B[0m \u001B[38;5;28;01mfrom\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;21;01mase\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mcalculators\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01memt\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;28;01mimport\u001B[39;00m EMT\n\u001B[0;32m---> 11\u001B[0m \u001B[38;5;28;01mimport\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;21;01mmatgl\u001B[39;00m\n\u001B[1;32m 12\u001B[0m \u001B[38;5;28;01mfrom\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;21;01mmatgl\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mext\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mase\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;28;01mimport\u001B[39;00m M3GNetCalculator\n\u001B[1;32m 13\u001B[0m \u001B[38;5;28;01mfrom\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;21;01malignn\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mff\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mff\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;28;01mimport\u001B[39;00m AlignnAtomwiseCalculator, default_path, revised_path, wt1_path, wt01_path, \\\n\u001B[1;32m 14\u001B[0m wt10_path\n",
"\u001B[0;31mModuleNotFoundError\u001B[0m: No module named 'matgl'"
]
}
],
"execution_count": 2

@coderabbitai coderabbitai Bot Dec 27, 2025


⚠️ Potential issue | 🟡 Minor

Missing dependency error captured in notebook output.

The notebook execution output shows ModuleNotFoundError: No module named 'matgl'. This suggests the notebook was executed in an environment missing required dependencies. Consider either:

  1. Adding matgl to the project's dependency list
  2. Clearing notebook outputs before committing to avoid capturing transient execution errors
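
For option 2, one way to strip executed outputs is shown below. This is a hedged sketch using only the stdlib json module (notebook files are plain JSON, so clearing "outputs" and "execution_count" on code cells removes transient errors like the ModuleNotFoundError above):

```python
# Sketch: clear executed outputs from a .ipynb before committing.
# Notebooks are JSON documents; code cells carry "outputs" and
# "execution_count" fields that capture transient execution state.
import json


def clear_notebook_outputs(nb_json: str) -> str:
    nb = json.loads(nb_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return json.dumps(nb, indent=1)
```

In practice, `jupyter nbconvert --clear-output --inplace notebook.ipynb` or an nbstripout pre-commit hook accomplishes the same thing without custom code.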
🧰 Tools
🪛 Ruff (0.14.10)

315-315: Ambiguous variable name: l

(E741)

🤖 Prompt for AI Agents
other/experiments/create_interface_with_min_energy_by_miller_indices.ipynb
around lines 309-329: the notebook output shows ModuleNotFoundError: No module
named 'matgl' because the execution captured a missing dependency; fix by adding
matgl to the project's dependency list (e.g., requirements.txt, pyproject.toml
or environment.yml) and/or the CI/test environment so the notebook can run, and
remove or clear executed output cells from the committed .ipynb to avoid
shipping transient execution errors.


+1


@timurbazhirov, thanks for acknowledging! Let me know if you need any help addressing the missing dependency or clearing the notebook outputs.

Comment thread utils/api.py
Comment on lines +14 to +40
def save_files(job_id: str, job_endpoint: JobEndpoints, filename_on_cloud: str, filename_on_disk: str) -> None:
"""
Saves a file to disk, overwriting any files with the same name as filename_on_disk

Args:
job_id (str): ID of the job
job_endpoint (JobEndpoints): Job endpoint object from the Exabyte API Client
filename_on_cloud (str): Name of the file on the server
filename_on_disk (str): Name the file will be saved to

Returns:
None
"""
files = job_endpoint.list_files(job_id)
for file in files:
if file["name"] == filename_on_cloud:
file_metadata = file

# Get a download URL for the CONTCAR
signed_url = file_metadata["signedUrl"]

# Download the contcar to memory
server_response = urllib.request.urlopen(signed_url)

# Write it to disk
with open(filename_on_disk, "wb") as outp:
outp.write(server_response.read())


⚠️ Potential issue | 🔴 Critical

Fix potential UnboundLocalError when file not found.

If filename_on_cloud does not match any file in the list, file_metadata will be uninitialized, causing an UnboundLocalError at line 33. Add error handling for this case.

🔎 Proposed fix
 def save_files(job_id: str, job_endpoint: JobEndpoints, filename_on_cloud: str, filename_on_disk: str) -> None:
     """
     Saves a file to disk, overwriting any files with the same name as filename_on_disk

     Args:
         job_id (str): ID of the job
         job_endpoint (JobEndpoints): Job endpoint object from the Exabyte API Client
         filename_on_cloud (str): Name of the file on the server
         filename_on_disk (str): Name the file will be saved to

     Returns:
         None
     """
     files = job_endpoint.list_files(job_id)
+    file_metadata = None
     for file in files:
         if file["name"] == filename_on_cloud:
             file_metadata = file
+            break
+    
+    if file_metadata is None:
+        raise FileNotFoundError(f"File '{filename_on_cloud}' not found in job {job_id}")

     # Get a download URL for the CONTCAR
     signed_url = file_metadata["signedUrl"]

     # Download the contcar to memory
     server_response = urllib.request.urlopen(signed_url)

     # Write it to disk
     with open(filename_on_disk, "wb") as outp:
         outp.write(server_response.read())
🧰 Tools
🪛 Ruff (0.14.10)

36-36: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)

🤖 Prompt for AI Agents
In utils/api.py around lines 14 to 40, file_metadata may be referenced before
assignment if filename_on_cloud is not found; initialize a sentinel (e.g.,
file_metadata = None) before the loop, set it when a match is found, and after
the loop check if it is still None — if so raise a clear FileNotFoundError (or
return/handle) with job_id and filename_on_cloud included; otherwise proceed to
use file_metadata as before.

Comment thread utils/api.py
Comment on lines +87 to +100
def copy_bank_workflow_by_system_name(endpoint: BankWorkflowEndpoints, system_name: str, account_id: str) -> dict:
"""
Copies a bank workflow with given ID into the account's workflows.

Args:
endpoint (endpoints.bank_workflows.BankWorkflowEndpoints): an instance of BankWorkflowEndpoints class
system_name (str): workflow system name.
account_id (str): ID of account to copy the bank workflow into.

Returns:
dict: new account's workflow
"""
bank_workflow_id = endpoint.list({"systemName": system_name})[0]["_id"]
return endpoint.copy(bank_workflow_id, account_id)["_id"]


⚠️ Potential issue | 🔴 Critical

Add error handling for missing workflow.

If no workflow exists with the given system_name, accessing [0] on line 99 will raise an IndexError. Add validation to handle this case gracefully.

🔎 Proposed fix
 def copy_bank_workflow_by_system_name(endpoint: BankWorkflowEndpoints, system_name: str, account_id: str) -> dict:
     """
     Copies a bank workflow with given ID into the account's workflows.

     Args:
         endpoint (endpoints.bank_workflows.BankWorkflowEndpoints): an instance of BankWorkflowEndpoints class
         system_name (str): workflow system name.
         account_id (str): ID of account to copy the bank workflow into.

     Returns:
         dict: new account's workflow
     """
-    bank_workflow_id = endpoint.list({"systemName": system_name})[0]["_id"]
+    workflows = endpoint.list({"systemName": system_name})
+    if not workflows:
+        raise ValueError(f"No bank workflow found with system name: {system_name}")
+    bank_workflow_id = workflows[0]["_id"]
     return endpoint.copy(bank_workflow_id, account_id)["_id"]
🤖 Prompt for AI Agents
In utils/api.py around lines 87 to 100, the call endpoint.list({"systemName":
system_name})[0] will raise IndexError if no workflow matches; update the
function to first capture the list result, validate it's non-empty, and handle
the empty case gracefully (either raise a clear ValueError/LookupError with a
descriptive message including system_name and account_id, or return
None/explicit error dict per project convention). After validation, proceed to
extract the _id and call endpoint.copy(...) and return its _id; ensure no bare
IndexError can propagate and include a helpful error message for debugging.

Comment thread: utils/api.py
Comment on lines +103 to +120
def get_property_by_subworkflow_and_unit_indicies(
endpoint: PropertiesEndpoints, property_name: str, job: dict, subworkflow_index: int, unit_index: int
) -> dict:
"""
Returns the property extracted in the given unit of the job's subworkflow.

Args:
endpoint (endpoints.properties.PropertiesEndpoints): an instance of PropertiesEndpoints class.
property_name (str): name of property to extract.
job (dict): job config to extract the property from.
subworkflow_index (int): index of subworkflow to extract the property from.
unit_index (int): index of unit to extract the property from.

Returns:
dict: extracted property
"""
unit_flowchart_id = job["workflow"]["subworkflows"][subworkflow_index]["units"][unit_index]["flowchartId"]
return endpoint.get_property(job["_id"], unit_flowchart_id, property_name)
⚠️ Potential issue | 🟠 Major

Add bounds checking for subworkflow and unit indices.

The function accesses nested indices without validation. If subworkflow_index or unit_index are out of bounds, this will raise an IndexError or KeyError. Consider adding validation or wrapping in a try-except with a helpful error message.

🔎 Suggested improvement
 def get_property_by_subworkflow_and_unit_indicies(
     endpoint: PropertiesEndpoints, property_name: str, job: dict, subworkflow_index: int, unit_index: int
 ) -> dict:
     """
     Returns the property extracted in the given unit of the job's subworkflow.

     Args:
         endpoint (endpoints.properties.PropertiesEndpoints): an instance of PropertiesEndpoints class.
         property_name (str): name of property to extract.
         job (dict): job config to extract the property from.
         subworkflow_index (int): index of subworkflow to extract the property from.
         unit_index (int): index of unit to extract the property from.

     Returns:
         dict: extracted property
     """
+    try:
+        subworkflow = job["workflow"]["subworkflows"][subworkflow_index]
+        unit = subworkflow["units"][unit_index]
+        unit_flowchart_id = unit["flowchartId"]
+    except (IndexError, KeyError) as e:
+        raise ValueError(
+            f"Invalid indices: subworkflow={subworkflow_index}, unit={unit_index}. "
+            f"Job has {len(job.get('workflow', {}).get('subworkflows', []))} subworkflows."
+        ) from e
-    unit_flowchart_id = job["workflow"]["subworkflows"][subworkflow_index]["units"][unit_index]["flowchartId"]
     return endpoint.get_property(job["_id"], unit_flowchart_id, property_name)
🤖 Prompt for AI Agents
In utils/api.py around lines 103 to 120, the function accesses nested
job["workflow"]["subworkflows"][subworkflow_index]["units"][unit_index]["flowchartId"]
without validating indices or keys, which can raise IndexError/KeyError; add
explicit bounds and key checks (e.g. ensure job contains "workflow" and
"subworkflows", that subworkflow_index is within range, that the selected
subworkflow contains "units" and unit_index is within range) and if any check
fails raise a clear ValueError (or custom exception) that includes the job id
and the provided indices; alternatively wrap the extraction in a try/except
catching IndexError/KeyError and re-raise a ValueError with a descriptive
message about which index/key was invalid and the job _id so callers get a
helpful error instead of a raw traceback.
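The try/except variant proposed in this thread can be sketched as a standalone helper. The job dict below mirrors the nested shape the function indexes into (`workflow` → `subworkflows` → `units` → `flowchartId`); the helper name and sample values are illustrative, not part of utils/api.py.

```python
def get_unit_flowchart_id(job, subworkflow_index, unit_index):
    """Resolve a unit's flowchartId, re-raising index/key failures as ValueError."""
    try:
        subworkflow = job["workflow"]["subworkflows"][subworkflow_index]
        return subworkflow["units"][unit_index]["flowchartId"]
    except (IndexError, KeyError) as e:
        n_subworkflows = len(job.get("workflow", {}).get("subworkflows", []))
        raise ValueError(
            f"Invalid indices for job {job.get('_id')}: "
            f"subworkflow={subworkflow_index}, unit={unit_index} "
            f"(job has {n_subworkflows} subworkflows)"
        ) from e


job = {
    "_id": "job-42",
    "workflow": {"subworkflows": [{"units": [{"flowchartId": "fc-0"}]}]},
}
print(get_unit_flowchart_id(job, 0, 0))
```

Out-of-range indices now produce a `ValueError` that names the job `_id` and both indices, so a notebook user sees which lookup failed instead of a raw traceback into the job dict.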

@VsevolodX VsevolodX merged commit 3293d7f into feature/SOF-7782 Dec 27, 2025
5 checks passed
@VsevolodX VsevolodX deleted the feature/SOF-7782-1 branch December 27, 2025 08:13
@coderabbitai coderabbitai Bot mentioned this pull request May 15, 2026