## RSBench

Compiler: gcc 4.8.5  
Flags: `-std=gnu99 -fopenmp -ffast-math -g -Ofast -DSTATUS`  
Libs: `-lm`   
On Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz (Broadwell)



---

### CPU Time
`rsbench.gnu -t 1` (serial mode)  
  
Inclusively, RSBench spends 97.8% of its time in `xs_kernel.c`.   
Exclusively, only 47.1% of total time is spent in `xs_kernel.c` itself.   
The remaining time is spent in the math library (`libm-2.17.so`).  
  


|xs_kernel.c|CPU Inclusive|CPU Exclusive|
|:----------|:-----------:|:-----------:|
|calculate_micro_xs_doppler|96.3%|27.2%|
|---> loop at line 181|39.4%|22.5%|
|calculate_sig_T|52.2%|2.9%|
|fast_nuclear_W|16.9%|15.5%|
|---> line 72|11.1%|11.1%|

#### Loop at line 181 (22.5% Exclusive CPU Time)
```c
	// Loop over Poles within window, add contributions
	for( int i = w.start; i < w.end; i++ )
	{
		Pole pole = data.poles[nuc][i];

		// Prep Z
		double complex Z = (E - pole.MP_EA) * dopp;
		if( cabs(Z) < 6.0 )
			(*abrarov)++;
		(*alls)++;

		// Evaluate Faddeeva Function
		complex double faddeeva = fast_nuclear_W( Z );

		// Update W
		sigT += creal( pole.MP_RT * faddeeva * sigTfactors[pole.l_value] );
		sigA += creal( pole.MP_RA * faddeeva);
		sigF += creal( pole.MP_RF * faddeeva);
	}
```

#### calculate_sig_T (source of most of the libm time, 49.7% of total CPU time)
```c
void calculate_sig_T( int nuc, double E, Input input, CalcDataPtrs data, complex double * sigTfactors )
{
	double phi;

	for( int i = 0; i < input.numL; i++ )
	{
		phi = data.pseudo_K0RS[nuc][i] * sqrt(E);

		if( i == 1 )
			phi -= - atan( phi );
		else if( i == 2 )
			phi -= atan( 3.0 * phi / (3.0 - phi*phi));
		else if( i == 3 )
			phi -= atan(phi*(15.0-phi*phi)/(15.0-6.0*phi*phi));

		phi *= 2.0;

		sigTfactors[i] = cos(phi) - sin(phi) * _Complex_I;
	}
}
```

#### fast_nuclear_W
"This function uses a combination of the Abrarov Approximation
and the QUICK_W three term asymptotic expansion.
Only expected to use Abrarov ~0.5% of the time."
  
The function defines several hard-coded constants and spends most of its time on line 72 (11.1% of CPU time):
```c
double complex W = I * Z * (a/(Z*Z - b) + c/(Z*Z - d));
```

---

### L1 Data Cache Hit Rate
|Function|L1 Data Cache Hit Rate|
|:----|:---:|
|calculate_micro_xs_doppler|41%|
|calculate_sig_T|84%|
|fast_nuclear_W|42%|