Merge pull request #5990 from tjhei/step-37-print-vec-info
step-37: print vectorization info
masterleinad committed Apr 22, 2018
2 parents c7a563f + 60c95a0 commit 8a2af55
Showing 6 changed files with 74 additions and 1 deletion.
3 changes: 3 additions & 0 deletions doc/news/changes/minor/20180421Heister
@@ -0,0 +1,3 @@
New: Utilities::System::get_current_vectorization_level() returns the currently used vectorization support in string format.
<br>
(Timo Heister, 2018/04/21)
3 changes: 2 additions & 1 deletion examples/step-37/doc/intro.dox
@@ -514,7 +514,8 @@ FEEvaluation for your computer, you should compile deal.II with the so-called
enabled by setting the variable <tt>CMAKE_CXX_FLAGS</tt> to
<tt>"-march=native"</tt> in the cmake build settings (on the command line,
specify <tt>-DCMAKE_CXX_FLAGS="-march=native"</tt>, see the deal.II README for
-more information). Similar options exist for other compilers.
+more information). Similar options exist for other compilers. We output
+the current vectorization length in the run() function of this example.


<h3>Running multigrid on large-scale parallel computers</h3>
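The link between compiler flags such as <tt>-march=native</tt> and the reported vectorization width can be illustrated with a small stand-alone sketch. It is not deal.II code: the names `sketch_vectorization_level` and `sketch_width_in_bits` are hypothetical, and the sketch assumes a GCC- or Clang-style compiler that predefines `__SSE2__`, `__AVX__`, and `__AVX512F__` when the corresponding instruction sets are enabled. The level numbering follows the table added to utilities.h in this commit.

```cpp
// Hypothetical helper, not part of deal.II: derive a vectorization level
// from the macros GCC and Clang predefine for the enabled instruction sets.
constexpr int sketch_vectorization_level()
{
#if defined(__AVX512F__)
  return 3; // 512-bit registers
#elif defined(__AVX__)
  return 2; // 256-bit registers
#elif defined(__SSE2__)
  return 1; // 128-bit registers
#else
  return 0; // no SIMD extension detected: one double at a time
#endif
}

// Register width in bits for a given level, following the table in this
// commit (0 -> 64, 1 -> 128, 2 -> 256, 3 -> 512).
constexpr unsigned int sketch_width_in_bits(const int level)
{
  return 64u << level;
}
```

Compiling the same file with and without <tt>-march=native</tt> may yield different values of `sketch_vectorization_level()`, which is the effect the new output line in run() is meant to make visible.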
4 changes: 4 additions & 0 deletions examples/step-37/doc/results.dox
@@ -13,6 +13,7 @@ Of more interest is to evaluate some aspects of the multigrid solver.
When we run this program in 2D for quadratic ($Q_2$) elements, we get the
following output (when run on one core in release mode):
@code
Vectorization over 2 doubles = 128 bits (SSE2), VECTORIZATION_LEVEL=1
Cycle 0
Number of degrees of freedom: 81
Total setup time (wall) 0.00159788s
@@ -67,6 +68,7 @@ use uniform mesh refinement, we get eight times as many elements and
approximately eight times as many degrees of freedom with each cycle:

@code
Vectorization over 2 doubles = 128 bits (SSE2), VECTORIZATION_LEVEL=1
Cycle 0
Number of degrees of freedom: 125
Total setup time (wall) 0.00231099s
@@ -105,6 +107,7 @@ degree_finite_element=4;</code> at the top of the program, we get the
following program output:

@code
Vectorization over 2 doubles = 128 bits (SSE2), VECTORIZATION_LEVEL=1
Cycle 0
Number of degrees of freedom: 729
Total setup time (wall) 0.00633097s
@@ -157,6 +160,7 @@ Finally, let us look at the timings with degree 8, which corresponds to
another round of mesh refinement in the lower order methods:

@code
Vectorization over 2 doubles = 128 bits (SSE2), VECTORIZATION_LEVEL=1
Cycle 0
Number of degrees of freedom: 4913
Total setup time (wall) 0.0842004s
13 changes: 13 additions & 0 deletions examples/step-37/step-37.cc
@@ -1159,9 +1159,22 @@ namespace Step37
// The function that runs the program is very similar to the one in
// step-16. We do fewer refinement steps in 3D compared to 2D, but that's
// it.
//
// Before we run the program, we output some information about the detected
// vectorization level as discussed in the introduction.
template <int dim>
void LaplaceProblem<dim>::run ()
{
{
const unsigned int n_vect_doubles = VectorizedArray<double>::n_array_elements;
const unsigned int n_vect_bits = 8*sizeof(double)*n_vect_doubles;

pcout << "Vectorization over " << n_vect_doubles
<< " doubles = " << n_vect_bits << " bits ("
<< Utilities::System::get_current_vectorization_level()
<< "), VECTORIZATION_LEVEL=" << DEAL_II_COMPILER_VECTORIZATION_LEVEL << std::endl;
}

for (unsigned int cycle=0; cycle<9-dim; ++cycle)
{
pcout << "Cycle " << cycle << std::endl;
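The arithmetic behind the new banner line (lanes times 8 bits per byte times `sizeof(double)`) can be checked outside of deal.II with a minimal sketch. The function `sketch_banner` is hypothetical and takes the lane count, level name, and level number as plain arguments instead of reading them from `VectorizedArray<double>` and `DEAL_II_COMPILER_VECTORIZATION_LEVEL`; it assumes a platform where `double` occupies 8 bytes.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-alone version of the banner printed in run().
// n_lanes plays the role of VectorizedArray<double>::n_array_elements.
std::string sketch_banner(const unsigned int n_lanes,
                          const std::string &level_name,
                          const int          level)
{
  // Width of one SIMD register in bits: bytes per double * 8 * lanes.
  const unsigned int n_bits =
    static_cast<unsigned int>(8 * sizeof(double) * n_lanes);

  std::ostringstream out;
  out << "Vectorization over " << n_lanes << " doubles = " << n_bits
      << " bits (" << level_name << "), VECTORIZATION_LEVEL=" << level;
  return out.str();
}
```

With two lanes and the name "SSE2" at level 1, the sketch reproduces the line shown in the updated results.dox output.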
34 changes: 34 additions & 0 deletions include/deal.II/base/utilities.h
@@ -621,6 +621,40 @@ namespace Utilities
*/
double get_cpu_load ();

/**
* Return the current level of vectorization as described by DEAL_II_COMPILER_VECTORIZATION_LEVEL
* in vectorization.h as a string. The list of possible return values is:
*
* <table>
* <tr>
* <td><tt>VECTORIZATION_LEVEL</tt></td>
* <td>Return Value</td>
* <td>Width in bits</td>
* </tr>
* <tr>
* <td>0</td>
* <td>disabled</td>
* <td>64</td>
* </tr>
* <tr>
* <td>1</td>
* <td>SSE2</td>
* <td>128</td>
* </tr>
* <tr>
* <td>2</td>
* <td>AVX</td>
* <td>256</td>
* </tr>
* <tr>
* <td>3</td>
* <td>AVX512</td>
* <td>512</td>
* </tr>
* </table>
*/
const std::string get_current_vectorization_level();

/**
* Structure that holds information about memory usage in kB. Used by
* get_memory_stats(). See man 5 proc entry /status for details.
18 changes: 18 additions & 0 deletions source/base/utilities.cc
@@ -631,6 +631,24 @@ namespace Utilities

#endif

const std::string
get_current_vectorization_level()
{
switch (DEAL_II_COMPILER_VECTORIZATION_LEVEL)
{
case 0:
return "disabled";
case 1:
return "SSE2";
case 2:
return "AVX";
case 3:
return "AVX512";
default:
AssertThrow(false, ExcInternalError("Invalid DEAL_II_COMPILER_VECTORIZATION_LEVEL."));
return "ERROR";
}
}


void get_memory_stats (MemoryStats &stats)
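For readers who want to experiment with the level-to-name mapping without building deal.II, the following is a minimal stand-alone sketch of the same idea. It is an illustrative variant, not the code in this commit: `sketch_level_name` is a hypothetical name, the level is passed as a runtime argument instead of being read from `DEAL_II_COMPILER_VECTORIZATION_LEVEL`, and an invalid level is reported with `std::invalid_argument` instead of `AssertThrow`.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-alone mapping from a vectorization level to its name,
// using the same four values as the table in utilities.h.
std::string sketch_level_name(const int level)
{
  switch (level)
    {
      case 0:
        return "disabled";
      case 1:
        return "SSE2";
      case 2:
        return "AVX";
      case 3:
        return "AVX512";
      default:
        // Assumption of this sketch: signal bad input with a standard
        // exception, so no fallback return value is needed.
        throw std::invalid_argument("Invalid vectorization level.");
    }
}
```

Returning a plain `std::string` by value is a design choice of the sketch; the function in the commit returns `const std::string`.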