This repository has been archived by the owner on Feb 2, 2024. It is now read-only.

MPI localization in one place #101

Merged 7 commits into master on Aug 7, 2019

Conversation

@shssf (Contributor) commented Jul 29, 2019

Step 1.
No algorithms changed.
The code was copied from one place to another without any modifications. It still has all its issues, bugs, memory leaks, etc.; fixing those is left for other PRs.

Exception: the CSV module changed slightly (in the C part) because MPI-related code was intermixed with non-MPI code. This needs to be addressed, but I think that will happen in another PR.

@fschlimb (Contributor)

While I support the idea of isolating transport/messaging technology, the choice of boundaries doesn't seem right. The PR creates a huge transport module which includes much more than just transport: file-read/write, communication (send/recv, collectives), join, shuffle and more.
It would be better to have a module which provides communication/transport functionality only and leave its use to the modules that define the functional semantics (e.g. keep csv-related stuff in csv module, keep quantile computation separate from communication etc).
I don't see how the suggested separation improves the code structure; I actually think it's making it worse.

@shssf (Contributor, Author) commented Jul 30, 2019

> The PR creates a huge transport module which includes much more than just transport:

Could you please provide an example?
If some small piece of non-transport logic sits in the transport module for the moment, that can be solved later; I didn't want to do everything in one PR. Currently this is just a plain code move, to avoid including mpi.h in many HPAT modules.

@fschlimb (Contributor) commented Jul 30, 2019

Things like `count_lines`, `hpat_mpi_csv_get_offsets`, `get_file_size`, `get_lower_upper_kth_parallel`.

I simply believe that the benefit of having fewer files include mpi.h does not justify impairing the code structure to any degree. This is a lot of code change without an apparent benefit.

In my mind, a better approach is to gradually move to an abstraction (like what 'hdist_*' was doing) everywhere. Once that's done, we can move it to a properly designed and isolated module. This will never make anything worse.

@shssf (Contributor, Author) commented Jul 30, 2019

@fschlimb `get_file_size` is fully MPI-dependent, and the other functions look the same.
In my plan, all these functions will eventually be implemented as a tree of C++ classes, but I don't want to spend time on that now. I need to give users the ability to use HPAT without an MPI library, and for that I need to know how many such functions there are and how they are used in HPAT.
This PR localizes MPI within HPAT and allows independent work on the MPI transport without modifying many HPAT sources.

@fschlimb (Contributor)

You can just add a comment to the functions to count them. No need to move them around.

There is a difference between the functionality that MPI provides and functions that depend on that functionality. I like the idea of isolating the former, but I don't think we should bundle the latter.

@shssf (Contributor, Author) commented Jul 30, 2019

> You can just add a comment to the functions to count them

You would have no guarantee of finding all such functions and their successors.

> I like the idea of isolating the former

I like this idea too. How should we do it, and what would the first step be in this approach? How much time would that task (removing the MPI dependency) take?

@shssf shssf merged commit 944e434 into master Aug 7, 2019
@shssf shssf deleted the mpi_localization_step_1 branch August 10, 2019 13:52
kozlov-alexey pushed a commit to kozlov-alexey/sdc that referenced this pull request Oct 4, 2019
* MPI localization in one place

* hdf5 mpi comment out

* remove MPI_LIB from chiframes module