Conversation

@Gautzilla (Contributor) commented on Sep 22, 2025

🐳 What's new?

This PR aims to lighten the job module in order to make it easier to configure the servers created by each task.
It also makes it possible to export Core API datasets on datarmor through jobs.

🐳 How to use it?

🐬 Public API

Simply set the requested server config in a JobConfig instance, and attach a JobBuilder with the specified config to the public API Dataset:

import os

from pandas import Timedelta

from osekit.utils.job import JobConfig, JobBuilder
from osekit.public_api.dataset import Dataset

dataset = Dataset(...) # See the Dataset documentation

job_config = JobConfig(
    nb_nodes=1, # Number of nodes on which the job runs
    ncpus=28, # Number of total cores used per node
    mem="60gb", # Maximum amount of physical memory used by the job
    walltime=Timedelta(hours=5), # Maximum amount of real time during which the job can be running
    venv_name=os.environ["CONDA_DEFAULT_ENV"], # Works only for conda venvs
    queue="omp" # Queue in which the job will be submitted
)

dataset.job_builder = JobBuilder(
    config=job_config,
)

# Now the dataset has a non-None job_builder attribute,
# running an analysis will write a PBS file in the logs directory
# and submit it to the requested queue.

dataset.run_analysis(...) # See the Analysis documentation
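If no JobBuilder is attached, the analysis should instead run locally (this assumes job_builder defaults to None, as the comment above suggests). A minimal sketch under that assumption:

# Assumption: leaving job_builder to its default (None) makes
# run_analysis execute locally instead of writing and submitting a PBS file.
dataset.job_builder = None
dataset.run_analysis(...) # Runs in the current process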

🐬 Core API

To export Core API datasets through jobs, the Job instances must be created manually, with the path to the export_analysis script and the list of arguments passed to that script.

Here is the example from the doc:

import os
from pathlib import Path

from pandas import Timedelta

from osekit.core_api.spectro_dataset import SpectroDataset
from osekit.core_api.audio_dataset import AudioDataset
from osekit.utils.job import JobConfig, Job

# Some Public API imports are required
from osekit.public_api.analysis import AnalysisType
from osekit.public_api import export_analysis

ads = AudioDataset(...) # See the AudioDataset doc
sds = SpectroDataset(...) # See the SpectroDataset doc

# We must specify the folders in which the files will be exported.
# This is an example with both audio and spectro exports.
ads.folder = Path(...)
sds.folder = Path(...)

# Datasets must be serialized
ads.write_json(ads.folder/"output")
sds.write_json(sds.folder/"output")

# Export specifications
# All parameters are listed in this example, but every parameter other than analysis has a default value
args = {
    "analysis": (AnalysisType.AUDIO|AnalysisType.SPECTROGRAM).value,
    "ads-json": ads.foler/"output"/f"{ads.name}.json",
    "sds-json": sds.foler/"output"/f"{sds.name}.json",
    "subtype": "FLOAT",
    "matrix-folder-path": "None", # Folder in which npz matrices are exported
    "spectrogram-folder-path": sds.folder/"output", # Folder in which png spectrograms are exported
    "welch-folder-path": "None",  # Folder in which npz welch matrices are exported
    "first": 0, # First data of the dataset to be exported
    "last": -1, # Last data of the dataset to be exported
    "downsampling-quality": "HQ",
    "upsampling-quality": "VHQ",
    "umask": 0o022,
    "tqdm-disable": "False", # Disable TQDM progress bars
    "multiprocessing": "True",
    "nb-processes": "None",  # Should be a string. "None" uses the max number of processes, otherwise e.g. "3" will use 3.
    "use-logging-setup": "True", # Call osekit.setup_logging() before exporting the dataset.
}

# Job and server configuration
job_config = JobConfig(
    nb_nodes=1,
    ncpus=28,
    mem="60gb",
    walltime=Timedelta(hours=1),
    venv_name=os.environ["CONDA_DEFAULT_ENV"],
    queue="omp"
)

job = Job(
    script_path=Path(export_analysis.__file__),
    script_args=args,
    config=job_config,
    name="test_job_core",
    output_folder=Path(...), # Path in which the .out and .err files are written
)

# Write the PBS file and submit the job
job.write_pbs(Path(...) / f"{job.name}.pbs")
job.submit_pbs()
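Since the script arguments expose first and last indices, one possible pattern (a sketch built on the API above, not something documented in this PR) is to split an export into several jobs, each handling a slice of the dataset:

# Sketch: submit one job per slice of the dataset.
# The slice bounds below are arbitrary examples; "first" and "last"
# are the data indices described in the args dict above.
slices = [(0, 499), (500, 999)]
for i, (first, last) in enumerate(slices):
    chunk_args = {**args, "first": first, "last": last}
    chunk_job = Job(
        script_path=Path(export_analysis.__file__),
        script_args=chunk_args,
        config=job_config,
        name=f"test_job_core_{i}",
        output_folder=Path(...), # Same kind of output folder as above
    )
    chunk_job.write_pbs(Path(...) / f"{chunk_job.name}.pbs")
    chunk_job.submit_pbs()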

🐡 Outro

I'm not sure the documentation is clear enough; please tell me if this is too clumsy to use!


@Gautzilla self-assigned this on Sep 22, 2025
@Gautzilla added the "job monitoring" label (Work related to monitoring of the jobs) on Sep 22, 2025
@Gautzilla changed the title from [DRAFT] Job rework to Job rework on Oct 31, 2025
@Gautzilla marked this pull request as ready for review on Oct 31, 2025
@mathieudpnt (Contributor) left a comment

The default value of last in script_args from the Job instance needs to be changed.

@Gautzilla (Contributor, Author) commented

> The default value of last in script_args from the Job instance needs to be changed.

It should be solved with the last commit; do I have your green light?

@mathieudpnt self-requested a review on Nov 18, 2025
@mathieudpnt previously approved these changes on Nov 18, 2025
@mathieudpnt (Contributor) left a comment

Looking good!

@mathieudpnt self-requested a review on Nov 18, 2025
@mathieudpnt merged commit 85f3c75 into Project-OSmOSE:main on Nov 18, 2025 (2 checks passed)
@Gautzilla deleted the job-rework branch on Nov 18, 2025