prototype of udp based job manager #1

jdries · 2024-06-28T08:26:00Z

Proposal for a UDP based variant of the job manager, created as part of APEx upscaling service:
https://jdries-vito.quarto.pub/apex-design/upscaling.html

Related issue is to support output to geoparquet: Open-EO/openeo-gfmap#107

The CSV format currently in use is limited: complex parameter types fail to deserialize correctly, which requires custom handling in this class. GeoParquet might improve this:
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#json
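To illustrate the CSV limitation, here is a minimal sketch using only the Python standard library (not the code from this PR). The row contents and the `csv_roundtrip` helper are hypothetical. A dict-valued parameter comes back from CSV as a plain string unless it is explicitly encoded and decoded, which is the kind of custom handling mentioned above.

```python
import csv
import io
import json

# Hypothetical job row: one scalar parameter and one complex (GeoJSON-like) parameter.
row = {
    "udp_id": "my_udp",
    "geometry": {"type": "Point", "coordinates": [5.0, 51.2]},
}


def csv_roundtrip(record: dict) -> dict:
    """Write one record to an in-memory CSV file and read it back."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(record))
    writer.writeheader()
    writer.writerow(record)
    buffer.seek(0)
    return next(csv.DictReader(buffer))


# Naive round trip: the dict is stringified on write and stays a string on read.
naive = csv_roundtrip(row)

# Explicit handling: encode complex values as JSON before writing, decode after reading.
encoded = {**row, "geometry": json.dumps(row["geometry"])}
decoded = csv_roundtrip(encoded)
decoded["geometry"] = json.loads(decoded["geometry"])
```

Here `naive["geometry"]` is a string, while `decoded["geometry"]` is a dict again. A format with a native JSON or nested type would make the explicit encode/decode step unnecessary.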

soxofaan

A couple of notes.

If I understand correctly, this PR adds two separate features to the existing job manager:

producing jobs from a fixed (but parameterized) UDP and a user-provided dataframe of parameters
running the job manager in a thread

These features seem to be unrelated, so I wonder whether they can be separated.

For example:

the job production could be a factory for a standard job manager
the threaded running could be a method on the standard job manager
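A rough sketch of what that separation might look like. The names `UdpJobProducer` and `run_in_thread` are hypothetical and are not part of this PR or of the openeo client; the only call taken from the PR is `connection.datacube_from_process(udp_id, udp_namespace, **parameters)`.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass
class UdpJobProducer:
    """Hypothetical factory: builds one job from a fixed UDP and a row of parameters."""

    udp_id: str
    udp_namespace: str

    def __call__(self, row: Mapping[str, Any], connection: Any) -> Any:
        # Every value in the row is passed to the UDP as a parameter.
        parameters = dict(row)
        return connection.datacube_from_process(
            self.udp_id, self.udp_namespace, **parameters
        )


def run_in_thread(run: Callable[[], None]) -> threading.Thread:
    """Run any blocking job loop in a background thread and return the thread."""
    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

With this split, the producer knows nothing about threading, and the threaded runner works for any job loop, UDP based or not.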

soxofaan · 2024-07-02T12:03:30Z

src/esa_apex_toolbox/upscaling/udp_job_manager.py

+        if self.dataframe is None:
+            self.dataframe = jobs_dataframe
+        else:
+            raise ValueError("Jobs already added to the job manager.")


This if/else/raise pattern suggests that the dataframe could have been a constructor argument.
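A minimal sketch of the constructor-argument variant. `JobManagerSketch` is a hypothetical stand-in, not the class from this PR.

```python
from typing import Any


class JobManagerSketch:
    """Hypothetical stand-in for the job manager in this PR."""

    def __init__(self, jobs_dataframe: Any):
        # The dataframe is required at construction time, so there is no
        # "jobs already added" state to guard against later.
        if jobs_dataframe is None:
            raise ValueError("A jobs dataframe is required.")
        self.dataframe = jobs_dataframe
```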

soxofaan · 2024-07-02T12:09:30Z

src/esa_apex_toolbox/upscaling/udp_job_manager.py

+                          p.get("schema", {}).get("subtype", "") == "geojson"]
+
+
+        output_file = Path("jobs.csv")


This hardcoded file path should be an argument, I guess.

soxofaan · 2024-07-02T12:15:38Z

src/esa_apex_toolbox/upscaling/udp_job_manager.py

+
+            cube = connection.datacube_from_process(row.udp_id,row.udp_namespace, **parameters)
+
+            title = row.get("title", f"Subjob {row.udp_id} - {str(parameters)}")


Use the row index instead of str(parameters) in the title to avoid extremely large titles?
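For instance, something along these lines. The `job_title` helper is hypothetical, and it assumes the caller has the row index at hand (for a pandas row obtained from `iterrows()`, that is the first element of each yielded pair).

```python
from typing import Any, Mapping


def job_title(row: Mapping[str, Any], index: Any, udp_id: str) -> str:
    """Use the row's own title if present, else a short title based on the row index."""
    return row.get("title", f"Subjob {udp_id} - {index}")
```

The title length then stays bounded no matter how large the parameter values are.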

soxofaan · 2024-07-02T12:15:58Z

src/esa_apex_toolbox/upscaling/udp_job_manager.py

+
+
+
+        import multiprocessing, time


These imports can be top-level, I think.

soxofaan · 2024-10-14T15:36:14Z

Because of various changes to the "official" MultiBackendJobManager from the client (e.g. built-in threaded run_jobs, and new job db initialization features), I think this PR is a dead end and had better be closed.
However, it served as inspiration to implement UDP based job management in the Python client itself:

initial PoC implementation of UDPJobFactory Open-EO/openeo-python-client#644

soxofaan · 2024-10-16T13:13:37Z

Just merged Open-EO/openeo-python-client#644

prototype of udp based job manager

bca93e7

soxofaan reviewed Jul 2, 2024

jdries mentioned this pull request Aug 19, 2024

UDP based job manager Open-EO/openeo-python-client#604

Closed

soxofaan mentioned this pull request Oct 14, 2024

initial PoC implementation of UDPJobFactory Open-EO/openeo-python-client#644

Merged

soxofaan closed this Oct 14, 2024

jdries deleted the udp_job_manager branch October 16, 2024 08:57

3 participants
