## PENNANT  
Run time is concentrated in three files: QCS.cc (33.5%(E)), Vec2.hh (31.8%(E)), and Mesh.cc (15.8%(E)).  
  
**Vec2.hh contains the double2 struct with its overloaded operators and other helper functions.**  
**Everything in it is inlined, so the time attributed to Vec2.hh is really spent at the call sites, and the times reported for the other areas are understated.**
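
To illustrate why a header can show up as a hotspot, here is a minimal sketch of a `double2`-style type with inline helpers. It is an assumption about the general shape of such a header, not a copy of Vec2.hh; `cornerArea` is a hypothetical helper that mirrors the `cvolume` expression in `setCornerDiv()`.

```cpp
#include <cmath>

// Hypothetical stand-in for the double2 type; the real Vec2.hh may differ.
struct double2 {
    double x, y;
};

// Every helper is defined inline in the header, so the compiler can fold its
// body into the calling loop. A profiler that maps instructions back to
// source lines then charges those cycles to the header, not to the caller.
inline double2 operator+(const double2& a, const double2& b) {
    return {a.x + b.x, a.y + b.y};
}
inline double2 operator-(const double2& a, const double2& b) {
    return {a.x - b.x, a.y - b.y};
}
inline double2 operator*(double s, const double2& a) {
    return {s * a.x, s * a.y};
}
inline double dot(const double2& a, const double2& b) {
    return a.x * b.x + a.y * b.y;
}
inline double cross(const double2& a, const double2& b) {
    return a.x * b.y - a.y * b.x;
}
inline double length(const double2& a) {
    return std::sqrt(dot(a, a));
}

// Hypothetical helper: the corner area written the same way as cvolume in
// setCornerDiv(). After inlining, this is a handful of subtracts and
// multiplies, all attributed to the operator and cross() source lines above.
inline double cornerArea(const double2& xp0, const double2& xp1,
                         const double2& xp2, const double2& xp3) {
    return 0.5 * cross(xp2 - xp0, xp3 - xp1);
}
```

Calling `cornerArea` on the corners of a unit square returns 1.0, yet none of the arithmetic would be charged to the caller's file in a line-level profile.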
  
#### Serial Run  
`./pennant ../test/leblancbig/leblancbig.pnt`

### QCS.cc | setCornerDiv( ) | Loop at line 137

```c
    // [2] Divergence at the corner   
    #pragma ivdep                                           // 21.4% CPUTIME(E)
    for (int c = cfirst; c < clast; ++c) {        
        int s2 = c;
        int s = mesh->mapss3[s2];
        // Associated zone, corner, point
        int z = mesh->mapsz[s];
        int z0 = z - zfirst;
        int c0 = c - cfirst;
        int p = mesh->mapsp2[s];
        // Points
        int p1 = mesh->mapsp1[s];
        int p2 = mesh->mapsp2[s2];
        // Edges
        int e1 = mesh->mapse[s];
        int e2 = mesh->mapse[s2];

        // Velocities and positions
        // 0 = point p
        up0 = pu[p];
        xp0 = px[p];
        // 1 = edge e2
        up1 = 0.5 * (pu[p] + pu[p2]);
        xp1 = ex[e2];
        // 2 = zone center z
        up2 = z0uc[z0];
        xp2 = zx[z];
        // 3 = edge e1
        up3 = 0.5 * (pu[p1] + pu[p]);
        xp3 = ex[e1];

        // compute 2d cartesian volume of corner
        double cvolume = 0.5 * cross(xp2 - xp0, xp3 - xp1);
        c0area[c0] = cvolume;

        // compute cosine angle
        double2 v1 = xp3 - xp0;
        double2 v2 = xp1 - xp0;
        double de1 = elen[e1];
        double de2 = elen[e2];
        double minelen = min(de1, de2);
        c0cos[c0] = ((minelen < 1.e-12) ?
                0. :
                4. * dot(v1, v2) / (de1 * de2));

        // compute divergence of corner
        c0div[c0] = (cross(up2 - up0, xp3 - xp1) -
                cross(up3 - up1, xp2 - xp0)) /
                (2.0 * cvolume);

        // compute evolution factor
        double2 dxx1 = 0.5 * (xp1 + xp2 - xp0 - xp3);
        double2 dxx2 = 0.5 * (xp2 + xp3 - xp0 - xp1);
        double dx1 = length(dxx1);
        double dx2 = length(dxx2);

        // average corner-centered velocity
        double2 duav = 0.25 * (up0 + up1 + up2 + up3);

        double test1 = abs(dot(dxx1, duav) * dx2);
        double test2 = abs(dot(dxx2, duav) * dx1);
        double num = (test1 > test2 ? dx1 : dx2);
        double den = (test1 > test2 ? dx2 : dx1);
        double r = num / den;
        double evol = sqrt(4.0 * cvolume * r);                   // line 199: 10.8% CPUTIME(E)
        evol = min(evol, 2.0 * minelen);

        // compute delta velocity
        double dv1 = length2(up1 + up2 - up0 - up3);
        double dv2 = length2(up2 + up3 - up0 - up1);
        double du = sqrt(max(dv1, dv2));

        c0evol[c0] = (c0div[c0] < 0.0 ? evol : 0.);
        c0du[c0]   = (c0div[c0] < 0.0 ? du   : 0.);
    }  // for c
```

---

### Mesh.cc | calcVols( ) 
```c
void Mesh::calcVols(                                              // 15.8% CPUTIME(E)
        const double2* px,
        const double2* zx,
        double* sarea,
        double* svol,
        double* zarea,
        double* zvol,
        const int sfirst,
        const int slast) {

    int zfirst = mapsz[sfirst];
    int zlast = (slast < nums ? mapsz[slast] : numz);
    fill(&zvol[zfirst], &zvol[zlast], 0.);
    fill(&zarea[zfirst], &zarea[zlast], 0.);

    const double third = 1. / 3.;
    int count = 0;
    for (int s = sfirst; s < slast; ++s) {                        // 6.3% CPUTIME(E)
        int p1 = mapsp1[s];
        int p2 = mapsp2[s];
        int z = mapsz[s];

        // compute side volumes, sum to zone
        double sa = 0.5 * cross(px[p2] - px[p1], zx[z] - px[p1]);
        double sv = third * sa * (px[p1].x + px[p2].x + zx[z].x);
        sarea[s] = sa;
        svol[s] = sv;
        zarea[z] += sa;
        zvol[z] += sv;

        // check for negative side volumes
        if (sv <= 0.) count += 1;

    } // for s

    if (count > 0) {
        #pragma omp atomic
        numsbad += count;
    }

}
```

#### CPI
1.01 Cycles per Instruction Program Aggregate.  
1.37 Cycles per Instruction at `QCS::setCornerDiv() | loop at line 137`.   
0.77 Cycles per Instruction at `Mesh::calcVols()`.   
0.80 Cycles per Instruction at `Vec2.hh`.  
#### Issue Cycles  
Program Aggregate:  
-- 2.58e+11 Full Issue | 17.7% Cycles Issuing Max Instructions   
-- 4.36e+08 No Issue | less than 0.1% Cycles Issuing No Instructions   
-- 1.46e+12 Total Cycles

`QCS::setCornerDiv() | loop at line 137`:  
-- 2.64e+10 Full Issue | 8.0% Cycles Issuing Max Instructions  
-- 3.80e+07 No Issue | less than 0.1% Cycles Issuing No Instructions  
-- 3.29e+11 Total Cycles  
  
`Mesh::calcVols`:    
-- 3.72e+09 Full Issue | 4.5% Cycles Issuing Max Instructions  
-- 1.20e+07 No Issue | less than 0.1% Cycles Issuing No Instructions  
-- 8.34e+10 Total Cycles  
  
`Vec2.hh`:  
-- 8.21e+10 Full Issue | 17.8% Cycles Issuing Max Instructions  
-- 1.48e+08 No Issue | less than 0.1% Cycles Issuing No Instructions  
-- 4.61e+11 Total Cycles  
#### Retiring Cycles  
Program Aggregate:  
-- 6.21e+11 Full Retire | 42.5% Cycles Retiring Max Instructions   
-- 3.32e+11 No Retire | 22.7% Cycles Retiring No Instructions   
-- 1.46e+12 Total Cycles

`QCS::setCornerDiv() | loop at line 137`:  
-- 1.15e+11 Full Retire | **35.0%** Cycles Retiring Max Instructions  
-- 9.75e+10 No Retire | 29.6% Cycles Retiring No Instructions  
-- 3.29e+11 Total Cycles  
  
`Mesh::calcVols`:    
-- 4.78e+10 Full Retire | **57.3%** Cycles Retiring Max Instructions  
-- 1.12e+10 No Retire | 13.4% Cycles Retiring No Instructions  
-- 8.34e+10 Total Cycles  
  
`Vec2.hh`:  
-- 2.36e+11 Full Retire | 51.2% Cycles Retiring Max Instructions  
-- 8.90e+10 No Retire | 19.3% Cycles Retiring No Instructions  
-- 4.61e+11 Total Cycles

### Memory
#### Data Cache 
Program Aggregate:  
-- 5.50e+10 L1 Data Cache Misses |  5.5% L1 Cache Miss Rate  
-- 8.83e+09 L2 Data Cache Misses | 83.9% L1 Misses Hit L2  
-- 9.94e+11 Load/Store Instructions  
`QCS::setCornerDiv() | loop at line 137`:  
-- 5.68e+09 L1 Data Cache Misses |  4.6% L1 Cache Miss Rate  
-- 7.42e+08 L2 Data Cache Misses | 86.9% L1 Misses Hit L2  
-- 1.23e+11 Load/Store Instructions  
`Mesh::calcVols`:    
-- 2.13e+09 L1 Data Cache Misses |  **3.3%** L1 Cache Miss Rate  
-- 4.21e+08 L2 Data Cache Misses | 80.2% L1 Misses Hit L2  
-- 6.36e+10 Load/Store Instructions  
`Vec2.hh`:  
-- 2.11e+10 L1 Data Cache Misses |  5.3% L1 Cache Miss Rate   
-- 3.56e+09 L2 Data Cache Misses | 83.1% L1 Misses Hit L2  
-- 3.95e+11 Load/Store Instructions