Changelog
Changes in version 4.0.3
r13947; 2013-03-30 10:55:56 -0500 (Sat, 30 Mar 2013)
- Fixed various issues related to wavefront diffusion (flyspray #43).
- Fixed issue related to the serial code for adaptive repartitioning
(flyspray #95).
- Incorporated the latest version of Metis.
Changes in version 4.0.2
r10987; 2011-10-31 09:42:33 -0500 (Mon, 31 Oct 2011)
- Updated the cmake files to use mpicc/mpicxx by default and removed
the MPI auto-detection that had been causing problems.
- Fixed a refinement assertion failure for zero-degree vertices.
Changes in version 4.0.1
r10758; 2011-09-15 17:09:42 -0500 (Thu, 15 Sep 2011)
- Fixed issue with geometric partitioning and too few vertices.
- Fixed memory leak related to progress reporting.
Changes in version 4.0
r10658; 2011-08-03 09:38:45 -0500 (Wed, 03 Aug 2011)
- Switched to collective communication operations in geometric partitioning.
- Fixed issues with numflag==1 and npes==1.
- Added Visual Studio support.
- More manual updates.
Changes in version 4.0rc1
r10592; 2011-07-16 16:17:53 -0500 (Sat, 16 Jul 2011)
- Improved the quality of the geometric partitioning routines.
- Removed the 4K limit on the maximum number of processors for
geometric partitioning.
- Fixed minor bugs that surfaced since 4.0a3.
- Updated the manual.
Changes in version 4.0a3
r10573; 2011-07-14 08:31:54 -0500 (Thu, 14 Jul 2011)
- Fixed an old and well-hidden bug in the core sparse communication
routines.
- Fixed the mesh partitioning routines, which were broken due to
the fact that ParMetis now tracks all memory allocations and
frees them at the end of the computations.
Changes in version 4.0a2
r10566; 2011-07-13 11:15:54 -0500 (Wed, 13 Jul 2011)
- Removed MAXNCON and MAX_PES constant dependency.
- Reduced memory requirements for MPI comm related data structures.
- Restructuring of the parameter-checking part of the code.
- Rewrote how ctrl/graph are set up. Cleaner code; fewer bugs.
- Fixed some bugs identified by the early testers.
Changes in version 4.0a1
- Serial parts of the code are now based on Metis 5.0.
- Complete 64-bit support, controlled at build time by setting the width
of the idx_t type in metis/include/metis.h.
- Rewrote the memory-management subsystem that ParMetis uses, reducing
its total memory consumption and adding support for graceful exits
(to be implemented in the final 4.0).
- Better support for multi-constraint partitioning with per-constraint
unbalance tolerances.
- Fixed various bugs that had been present since 3.0.
Changes in version 3.2
- Added a new ordering code that incorporates two major improvements
in the refinement routines, which should give it performance comparable
to that of serial Metis. In addition, the new ordering routines
eliminate the power-of-two restriction of the old routines.
- Added a new API function ParMETIS_V32_NodeND that exposes the new
ordering options to the user. The old API function is still valid
and utilizes the new API.
- Added logic to switch to ParMETIS_V3_PartKway when
ParMETIS_V3_PartGeomKway is called with more than 4096 processors.
This is due to a current limitation of ParMETIS_V3_PartGeomKway
for large numbers of processors (i.e., it uses too much memory).
- Fixed various compilation warnings due to the latest glibc version.
- Better handling of island (multi-)vertices.
- Fixed a number of reported bugs. The following tasks correspond
to the issues reported at http://glaros.dtc.umn.edu/flyspray
- Flyspray Task 55: Fixed segfault when graph->nvtxs == 0.
- Flyspray Task 54: The above fix applies here as well.
- Flyspray Task 53: Implemented a partial fix. Complete fix in 4.0.
- Flyspray Task 50: Free-memory write in PartGeomKway.
- Flyspray Task 38: Removed malloc.h from stdheaders.h
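The processor-count fallback described in 3.2 amounts to a simple dispatch, sketched below in Python. The real routines (ParMETIS_V3_PartGeomKway, ParMETIS_V3_PartKway) are C APIs; the function names and return values here are hypothetical stand-ins.

```python
# Illustrative sketch of the fallback described above; not ParMETIS code.
GEOM_KWAY_MAX_PES = 4096  # limit stated in this changelog entry

def part_kway(graph, npes):
    """Stand-in for the pure multilevel k-way partitioner."""
    return ("kway", npes)

def part_geom_kway(graph, coords, npes):
    """Stand-in for the geometric k-way partitioner: above the PE limit
    it dispatches to part_kway, because the geometric variant uses too
    much memory at that scale."""
    if npes > GEOM_KWAY_MAX_PES:
        return part_kway(graph, npes)
    return ("geom-kway", npes)
```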
Changes in version 3.1.1
- Fixed a number of bugs that have been reported over the years.
The following tasks correspond to the issues reported at
http://glaros.dtc.umn.edu/flyspray
- Flyspray Task 8: Fixed deallocation of user-supplied vsize
- Flyspray Task 28: Fixed ParMETIS_V3_Mesh2Dual static arrays
- Flyspray Task 30: Fixed 1025 instead of 1024 buckets
- Flyspray Task 34: Fixed writing past wspace->core for certain cases
- Flyspray Task 35: Fixed issues associated with assumed 0-based indexing
- Flyspray Task 36: Fixed mesh 1->0 numbering error
- Fixed the parallel ordering code not using the user-supplied seed.
Changes in version 3.1
- The mesh partitioning and dual creation routines have been changed to
support mixed-element meshes.
- The parmetis.h header file has been restructured and is now C++ friendly.
- Fortran bindings/renamings for various routines have been added.
- A number of bugs have been fixed.
- tpwgts are now respected for small graphs.
- fixed various divide by zero errors.
- removed dependency on the old drand48() routines.
- fixed some memory leaks.
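The dual-graph creation mentioned above can be illustrated serially: each mesh element becomes a dual vertex, and two elements are connected when they share at least a given number of mesh nodes (the role of the ncommonnodes argument in the C API). The real routine, ParMETIS_V3_Mesh2Dual, computes this in parallel; the Python below is only a sketch of the idea.

```python
# Serial, illustrative sketch of dual-graph construction; not the
# ParMETIS implementation.
from collections import defaultdict

def mesh_to_dual(elements, ncommon=2):
    """elements: list of node-id tuples, one per element.
    Returns an adjacency list for the dual graph: elements e and f are
    adjacent when they share at least `ncommon` mesh nodes."""
    node_to_elems = defaultdict(set)
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_to_elems[n].add(e)
    adj = []
    for e, nodes in enumerate(elements):
        shared = defaultdict(int)  # other element -> shared-node count
        for n in nodes:
            for other in node_to_elems[n]:
                if other != e:
                    shared[other] += 1
        adj.append(sorted(o for o, c in shared.items() if c >= ncommon))
    return adj
```

For example, triangles (0,1,2) and (1,2,3) share the edge {1,2}, so with ncommon=2 they are adjacent in the dual.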
Changes in version 3.0
- The names and calling sequences of all the routines have changed due
to the expanded functionality provided in this release. The 2.0 API
calls have been mapped to the new routines, but the expanded
functionality is only available through the new calling sequences.
- The four adaptive repartitioning routines:
ParMETIS_RepartLDiffusion,
ParMETIS_RepartGDiffusion,
ParMETIS_RepartRemap, and
ParMETIS_RepartMLRemap,
have been replaced by a single routine called ParMETIS_V3_AdaptiveRepart
that implements a unified repartitioning algorithm combining the best
features of the previous routines.
- Multiple vertex weights/balance constraints are supported for most of the
routines. This allows ParMETIS to be used to partition graphs for multi-phase
and multi-physics simulations.
- In order to optimize partitionings for specific heterogeneous computing
architectures, it is now possible to specify the target sub-domain weights
for each of the sub-domains and for each balance constraint. This feature,
for example, allows the user to compute a partitioning in which one of the
sub-domains is twice the size of all of the others.
- The number of sub-domains has been de-coupled from the number of processors
in both the static and the adaptive partitioning schemes. Hence, it is now
possible to use the parallel partitioning and repartitioning algorithms
to compute a k-way partitioning independent of the number of processors
that are used. Note that Version 2.0 provided this functionality for the
static partitioning schemes only.
- Routines are provided for both directly partitioning a finite element mesh,
and for constructing the dual graph of a mesh in parallel.
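The target sub-domain weights described above are passed to the V3 routines through the tpwgts array: for each balance constraint, a list of fractions that sum to one. The Python helper below (illustrative only; it is not part of the C API) turns relative sub-domain sizes into such fractions.

```python
# Illustrative sketch: derive tpwgts-style fractions from relative
# sub-domain sizes. Not part of the ParMETIS API.
def target_part_weights(rel_sizes):
    """Normalize relative sub-domain sizes into fractions summing to 1."""
    total = sum(rel_sizes)
    return [s / total for s in rel_sizes]

# The example from the text: one sub-domain twice the size of the
# other three.
tpwgts = target_part_weights([2, 1, 1, 1])  # [0.4, 0.2, 0.2, 0.2]
```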
Changes in version 2.0
- Changed the names and calling sequences of all the routines to make it
easier to use ParMETIS with Fortran.
- Improved the performance of the diffusive adaptive repartitioning
algorithms.
- Added a new set of adaptive repartitioning routines that are based on
the remapping paradigm. These routines are called ParMETIS_RepartRemap
and ParMETIS_RepartMLRemap.
- The number of partitions has been de-coupled from the number of processors.
You can now use the parallel partitioning algorithms to compute a k-way
partitioning independent of the number of processors that you use.
- The partitioning and ordering algorithms in ParMETIS now utilize
various portions of the serial METIS library. As a result, the quality
of the produced partitionings and orderings has been improved.
Remember to link your code with both libmetis.a and libparmetis.a.
Changes in version 1.0
- Added partitioning routines that take advantage of coordinate
information. These routines are based on space-filling curves and are
used to quickly compute an initial distribution for PARKMETIS.
A total of three such routines have been added: PARGKMETIS, PARGRMETIS,
and PARGMETIS.
- Added a fill-reducing ordering routine that is based on multilevel nested
dissection. This is similar to the ordering routine in the serial Metis
with the difference that it directly computes and refines vertex
separators. The new routine is called PAROMETIS and returns the new ordering
of the local nodes plus a vector describing the sizes of the various
separators that form the elimination tree.
- Changed the calling sequence again! I found it awkward to require that
communicators and other scalar quantities be passed by reference.
- Fixed a number of memory leaks.
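The space-filling-curve idea behind the coordinate-based routines can be sketched serially. The Python below uses a Z-order (Morton) curve for simplicity; it illustrates the approach of cutting a locality-preserving curve into contiguous chunks, not the exact curve or code these routines use.

```python
# Illustrative sketch of a space-filling-curve distribution; not the
# ParMETIS implementation.
def morton_key(x, y, bits=16):
    """Interleave the bits of non-negative integer coordinates; sorting
    by this key orders points along a Z-order space-filling curve."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

def sfc_distribute(points, npes):
    """Assign each point to a processor by cutting the curve order into
    npes contiguous chunks; nearby points land on the same processor."""
    order = sorted(range(len(points)), key=lambda i: morton_key(*points[i]))
    chunk = (len(points) + npes - 1) // npes
    owner = [0] * len(points)
    for rank, i in enumerate(order):
        owner[i] = rank // chunk
    return owner
```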
Changes in version 0.3
- Incorporated parallel multilevel diffusion algorithms for repartitioning
adaptively refined meshes. Two routines have been added for this purpose:
PARUAMETIS, which performs undirected multilevel diffusion, and
PARDAMETIS, which performs directed multilevel diffusion.
- Changed the names and calling sequences of the parallel partitioning
and refinement algorithms. Now they are called PARKMETIS for the
k-way partitioning and PARRMETIS for the k-way refinement.
Also the calling sequence has been changed slightly to make ParMETIS
Fortran callable.
- Added an additional option for selecting the algorithm used for the
initial partitioning of the coarsest graph. You now have the choice of
either a serial or a parallel algorithm. The parallel initial
partitioning speeds up the algorithm, especially for large numbers of
processors. NOTE that the parallel initial partitioning works only for
partition counts that are powers of two. If you want a partition count
that is not a power of two, you must use the old serial initial
partitioning option.
- Fixed some bugs in the initial partitioning code.
- Made parallel k-way refinement more robust by randomly ordering the
processors at each phase.
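The power-of-two restriction on the parallel initial partitioning amounts to a simple applicability check, sketched below in Python. The function names are hypothetical; they are not part of the ParMETIS API.

```python
# Illustrative sketch of the restriction described above: the parallel
# initial-partitioning option applies only to power-of-two partition
# counts; otherwise the serial option must be used.
def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0

def initial_partitioning_mode(nparts):
    return "parallel" if is_power_of_two(nparts) else "serial"
```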
Changes in version 0.2
- A complete reworking of the primary algorithms. The performance of
the code has improved considerably: over 30% on a 128-processor
Cray T3D. The improvement should be higher on machines with higher
latencies.
Here are some performance numbers on the T3D using Cray's MPI for
two graphs, mdual (0.25M vertices) and mdual2 (1.0M vertices):

         16PEs   32PEs   64PEs  128PEs
mdual     4.07    2.97    2.82
mdual2   15.02    8.89    6.12    5.75
- The quality of the produced partitions has been improved.
- Added options[2] to specify C- or Fortran-style numbering.
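The numbering choice amounts to shifting the CSR graph arrays by one. The Python helper below is an illustration of the difference the option selects, not part of the ParMETIS API.

```python
# Illustrative sketch: the same CSR graph in Fortran-style (1-based)
# versus C-style (0-based) numbering.
def to_c_numbering(xadj, adjncy):
    """Shift a 1-based CSR graph (xadj offsets and adjncy neighbor ids)
    down by one to obtain 0-based numbering."""
    return [x - 1 for x in xadj], [a - 1 for a in adjncy]

# A 3-vertex path graph (1-2-3) in Fortran numbering:
xadj_f, adjncy_f = [1, 2, 4, 5], [2, 1, 3, 2]
xadj_c, adjncy_c = to_c_numbering(xadj_f, adjncy_f)
```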