Combine pmp2sdp and sdpb into a single executable? #78

Open · 3 tasks · Tracked by #96
vasdommes opened this issue Jun 28, 2023 · 3 comments

vasdommes commented Jun 28, 2023

Problem

Common use case:

  • User runs sdp2input, which writes the sdp to disk
  • User runs sdpb, which reads the sdp from disk and runs the solver
  • The intermediate sdp is never used elsewhere

These intermediate IO operations can be quite expensive (tens of minutes). In Skydive, sdp2input+sdpb are called on each iteration, and IO takes up to ~50% of the total sdpb time (as noted by @suning-git).

Solution

Create a single executable that accepts input in different formats and performs in-memory conversion to the format accepted by the solver.

Potential issues

See comments below for details.

  • sdp.zip is needed for SDPB restart. SDPB has to decide whether to overwrite sdp.zip or to ignore the PMP input if both are present. This may lead to subtle bugs if the user doesn't fully understand SDPB's behavior.
  • Redistributing PMP matrices and blocks among the cores may be non-trivial, with lots of MPI messaging. Do we get any significant speedup compared to our current simple strategy of "write sdp, then read sdp"?
  • We may hit a "too many open files" error (each core writes something) - do we need to limit the number of open files during the SDP write? Note that the same problem can already occur, e.g. in SDPB in debug mode when writing profiling data.

vasdommes commented Jul 19, 2023

SDPB restart

sdp.zip is reused for SDPB restart, so it makes sense to write it anyway.

We can run SDPB with the following options:

sdpb --pmpPath=pmp.json --sdpPath=sdp.zip

Behavior:

  • If sdp.zip exists, then SDPB will read it and start the solver (ignoring PMP).
  • If sdp.zip doesn't exist, SDPB will read pmp.json, convert it to sdp, write the result to sdp.zip, and then start the solver (see the sketch below).
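
A minimal sketch of this decision logic (the helper names convert_pmp_to_sdp and run_solver are placeholders for this sketch, not the actual SDPB functions):

```cpp
#include <filesystem>
#include <iostream>

namespace fs = std::filesystem;

// Hypothetical stand-ins for the real pmp2sdp conversion and SDPB solver.
void convert_pmp_to_sdp(const fs::path &pmp, const fs::path &sdp)
{
  std::cout << "converting " << pmp << " -> " << sdp << "\n";
}
void run_solver(const fs::path &sdp)
{
  std::cout << "solving " << sdp << "\n";
}

// Proposed behavior for: sdpb --pmpPath=pmp.json --sdpPath=sdp.zip
void run(const fs::path &pmp, const fs::path &sdp)
{
  if(fs::exists(sdp))
    {
      // sdp.zip already exists: restart from it, ignoring the PMP input.
      std::cout << "reusing existing " << sdp << ", ignoring " << pmp << "\n";
    }
  else
    {
      // No sdp.zip yet: convert the PMP and write sdp.zip for future restarts.
      convert_pmp_to_sdp(pmp, sdp);
    }
  run_solver(sdp);
}
```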

NB: This can be problematic in the following scenario:

sdpb --pmpPath=old_pmp.json --sdpPath=sdp.zip
sdpb --pmpPath=new_pmp.json --sdpPath=sdp.zip # User assumes that sdp.zip will be overwritten, but it isn't!

Possible solutions:

  • An explicit flag --overwriteSdp=true. This is bad because we generally want to restart SDPB with the same parameters, so it would convert the PMP again on every restart.
  • Somehow check that the PMP input is unchanged, i.e. calculate a checksum of all input files (and store it in sdp.zip/control.json?). This works, but adds extra complexity (sketched below).
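
A minimal sketch of the checksum option, assuming we hash all PMP input files in a fixed order and store the result somewhere like control.json; FNV-1a and all names here are illustrative only:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <vector>

// FNV-1a is just a placeholder; any stable hash (e.g. SHA-256) would do.
std::uint64_t fnv1a(std::uint64_t hash, const char *data, std::size_t size)
{
  for(std::size_t i = 0; i < size; ++i)
    {
      hash ^= static_cast<unsigned char>(data[i]);
      hash *= 1099511628211ULL;
    }
  return hash;
}

// Checksum of all PMP input files, in a fixed (sorted) order.
// The result could be stored inside sdp.zip (e.g. in control.json) when the
// sdp is written, and compared against the current PMP input on the next run.
std::uint64_t pmp_checksum(std::vector<std::filesystem::path> files)
{
  std::sort(files.begin(), files.end());
  std::uint64_t hash = 14695981039346656037ULL;
  std::vector<char> buffer(1 << 20);
  for(const auto &file : files)
    {
      std::ifstream stream(file, std::ios::binary);
      while(stream)
        {
          stream.read(buffer.data(), buffer.size());
          hash = fnv1a(hash, buffer.data(), stream.gcount());
        }
    }
  return hash;
}
```

On restart, a mismatch between the stored and freshly computed checksums would indicate that the PMP input has changed and the sdp should be regenerated.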

Thus, if we only care about usability (and not about IO performance), maybe it's still better to keep two separate executables.

vasdommes commented:

Distributing PMP matrices and SDP blocks

Speaking of performance, there is a problem with distributing the blocks among the cores.

Current behavior:

In sdp2input, each core stores and processes only some of the polynomial matrices, according to a simple rule: matrix_index % num_procs == rank (sketched below).
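
For reference, that rule amounts to something like the following (a sketch, not the actual sdp2input code):

```cpp
#include <mpi.h>

#include <cstddef>
#include <vector>

// Round-robin assignment of PMP matrices to MPI ranks,
// i.e. the rule matrix_index % num_procs == rank,
// expressed as the list of matrix indices owned by this rank.
std::vector<std::size_t> my_matrix_indices(std::size_t num_matrices)
{
  int rank = 0, num_procs = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &num_procs);

  std::vector<std::size_t> indices;
  for(std::size_t index = rank; index < num_matrices; index += num_procs)
    indices.push_back(index);
  return indices;
}
```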

In SDPB, we distribute blocks among cores according to block costs (which are read from timing data or estimated from block sizes):
https://github.com/davidsd/sdpb/blob/3019fcd7122794ddb9618de1adcd1d8439716031/src/sdp_solve/Block_Info/read_block_costs.cxx

Moreover, a single block can be stored as a DistMatrix for a group of cores, if procGranularity>1.

The problem

If we want to keep everything in memory (without writing and reading sdp.zip), how do we switch from the initial (PMP) block distribution to the final one?

If timing data is available, we can use it from the very beginning.
Potential problem: procGranularity. We'll probably have to read the same matrix on each core of the group, convert it (again on each core), and then store it in a DistMatrix. Or read it only on the first core of the group and then send it to the other cores (see the broadcast sketch below).
Another problem: if the order of PMP files changes, then the block indexing changes, so the stored timing data may no longer match the blocks.
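
A sketch of the "read on the first core, then send to the others" option, assuming a communicator group_comm for the procGranularity group (the helper names and the lack of chunking are simplifications for this sketch):

```cpp
#include <mpi.h>

#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read the whole file into memory (illustrative helper).
std::vector<char> read_file(const std::string &path)
{
  std::ifstream stream(path, std::ios::binary);
  return std::vector<char>(std::istreambuf_iterator<char>(stream),
                           std::istreambuf_iterator<char>());
}

// Read a PMP file only on the first rank of a procGranularity group,
// then broadcast the raw bytes to the other ranks of the same group.
std::vector<char> read_on_group_root(const std::string &path,
                                     MPI_Comm group_comm)
{
  int group_rank = 0;
  MPI_Comm_rank(group_comm, &group_rank);

  std::vector<char> bytes;
  if(group_rank == 0)
    bytes = read_file(path);

  // Broadcast the size first, then the contents.
  // (A real implementation would broadcast in chunks if size > INT_MAX.)
  unsigned long long size = bytes.size();
  MPI_Bcast(&size, 1, MPI_UNSIGNED_LONG_LONG, 0, group_comm);
  bytes.resize(size);
  MPI_Bcast(bytes.data(), static_cast<int>(size), MPI_CHAR, 0, group_comm);
  return bytes;
}
```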

If there is no timing data, then we can read PMP matrices as we do now, and then perform some non-trivial MPI messaging to redistribute them. Maybe it will not be significantly faster than just writing to disk and reading again.
Probably we could also look at the PMP matrix sizes and estimate the corresponding block costs at the start (see the sketch below)?
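
If so, a very rough sketch might look like this; the struct fields and the height-squared heuristic are placeholders only, not SDPB's actual cost model:

```cpp
#include <cstddef>
#include <vector>

// Illustrative description of a PMP matrix size; the field names are
// made up for this sketch, not taken from the SDPB sources.
struct Matrix_Size
{
  std::size_t dim;         // dimension of the polynomial matrix
  std::size_t num_points;  // number of sample points
};

// Placeholder cost estimate, used only when no timing data is available.
std::vector<double> estimate_block_costs(const std::vector<Matrix_Size> &sizes)
{
  std::vector<double> costs;
  costs.reserve(sizes.size());
  for(const auto &size : sizes)
    {
      // Guess: work is dominated by dense linear algebra on blocks whose
      // height grows like dim * num_points, so take cost ~ height^2.
      const double height
        = static_cast<double>(size.dim) * static_cast<double>(size.num_points);
      costs.push_back(height * height);
    }
  return costs;
}
```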

Anyway, all this requires non-trivial code changes, and we should do it only if IO for sdp.zip is a real bottleneck.
(e.g. we probably don't want to fix this if generating and writing PMP in Mathematica is much slower)

vasdommes added this to the Backlog milestone on Nov 14, 2023
vasdommes changed the title from "Combine sdp2input/pvm2sdp and sdpb into a single executable" to "Combine pmp2sdp and sdpb into a single executable?" on Jan 28, 2024

vasdommes commented:

After the recent pmp2sdp input (#150) and output (#177) optimizations, IO performance should not be that much of a problem.
