
Build errors in MSVC 2015 #99

Closed
DThiebaud opened this issue Jul 1, 2016 · 11 comments

@DThiebaud

3>E:\src\re2\re2\dfa.cc(104): warning C4200: nonstandard extension used: zero-sized array in struct/union
3> E:\src\re2\re2\dfa.cc(104): note: Cannot generate copy-ctor or copy-assignment operator when UDT contains a zero-sized array
3>E:\src\re2\re2\dfa.cc(729): error C2466: cannot allocate an array of constant size 0

3>E:\src\re2\re2\onepass.cc(134): warning C4200: nonstandard extension used: zero-sized array in struct/union
3> E:\src\re2\re2\onepass.cc(134): note: Cannot generate copy-ctor or copy-assignment operator when UDT contains a zero-sized array

Setting the zero-size array to one compiles cleanly but may not be the desired way to handle this.

@junyer
Contributor

junyer commented Jul 1, 2016

D'oh! I checked MSDN (https://msdn.microsoft.com/en-us/library/b6fae073.aspx) before I made that change in order to have some assurance that it would work with MSVC, but there must be more to it. :(

@junyer
Contributor

junyer commented Jul 1, 2016

Could you please try adding #pragma warning(disable: 4200) to dfa.cc and onepass.cc? I'm hoping that will address error C2466 at the same time.

@DThiebaud
Author

I'll try it, but remember that dfa.cc got an error (not a warning) on line 729, and I doubt that suppressing the warning will suppress the error.

@junyer
Contributor

junyer commented Jul 2, 2016

If MSVC still has a problem with the initialisation, maybe try just declaring state and then assigning the fields individually? (Although I would prefer to keep line 729 as it is...)

@DThiebaud
Author


I'll try it. What does declaring "state" do? I'm not familiar with it.
Note I'm busy this weekend and may not get to this until Tuesday.

@junyer
Contributor

junyer commented Jul 3, 2016

This is line 729:

  State state = {inst, ninst, flag};

If it turns out that disabling the warning isn't enough, then you could perhaps try this instead:

  State state;
  state.inst_ = inst;
  state.ninst_ = ninst;
  state.flag_ = flag;

That would reveal whether it's just the form of the initialisation that's upsetting MSVC.

@DThiebaud
Author


With this change, and with suppressing the warnings, re2 builds with no
warnings and no errors. The modified files dfa.cc and onepass.cc are
attached.

// Copyright 2008 The RE2 Authors. All Rights Reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// Tested by search_test.cc.
//
// Prog::SearchOnePass is an efficient implementation of
// regular expression search with submatch tracking for
// what I call "one-pass regular expressions". (An alternate
// name might be "backtracking-free regular expressions".)
//
// One-pass regular expressions have the property that
// at each input byte during an anchored match, there may be
// multiple alternatives but only one can proceed for any
// given input byte.
//
// For example, the regexp /x*yx*/ is one-pass: you read
// x's until a y, then you read the y, then you keep reading x's.
// At no point do you have to guess what to do or back up
// and try a different guess.
//
// On the other hand, /x*x/ is not one-pass: when you're
// looking at an input "x", it's not clear whether you should
// use it to extend the x* or as the final x.
//
// More examples: /([^ ]*) (.*)/ is one-pass; /(.*) (.*)/ is not.
// /(\d+)-(\d+)/ is one-pass; /(\d+).(\d+)/ is not.
//
// A simple intuition for identifying one-pass regular expressions
// is that it's always immediately obvious when a repetition ends.
// It must also be immediately obvious which branch of an | to take:
//
// /x(y|z)/ is one-pass, but /(xy|xz)/ is not.
//
// The NFA-based search in nfa.cc does some bookkeeping to
// avoid the need for backtracking and its associated exponential blowup.
// But if we have a one-pass regular expression, there is no
// possibility of backtracking, so there is no need for the
// extra bookkeeping. Hence, this code.
//
// On a one-pass regular expression, the NFA code in nfa.cc
// runs at about 1/20 of the backtracking-based PCRE speed.
// In contrast, the code in this file runs at about the same
// speed as PCRE.
//
// One-pass regular expressions get used a lot when RE is
// used for parsing simple strings, so it pays off to
// notice them and handle them efficiently.
//
// See also Anne Brüggemann-Klein and Derick Wood,
// "One-unambiguous regular languages", Information and Computation 142(2).

#include <string.h>
#include <map>
#include "util/util.h"
#include "util/sparse_set.h"
#include "re2/prog.h"
#include "re2/stringpiece.h"

namespace re2 {

static const int Debug = 0;

// The key insight behind this implementation is that the
// non-determinism in an NFA for a one-pass regular expression
// is contained. To explain what that means, first a
// refresher about what regular expression programs look like
// and how the usual NFA execution runs.
//
// In a regular expression program, only the kInstByteRange
// instruction processes an input byte c and moves on to the
// next byte in the string (it does so if c is in the given range).
// The kInstByteRange instructions correspond to literal characters
// and character classes in the regular expression.
//
// The kInstAlt instructions are used as wiring to connect the
// kInstByteRange instructions together in interesting ways when
// implementing | + and *.
// The kInstAlt instruction forks execution, like a goto that
// jumps to ip->out() and ip->out1() in parallel. Each of the
// resulting computation paths is called a thread.
//
// The other instructions -- kInstEmptyWidth, kInstMatch, kInstCapture --
// are interesting in their own right but like kInstAlt they don't
// advance the input pointer. Only kInstByteRange does.
//
// The automaton execution in nfa.cc runs all the possible
// threads of execution in lock-step over the input. To process
// a particular byte, each thread gets run until it either dies
// or finds a kInstByteRange instruction matching the byte.
// If the latter happens, the thread stops just past the
// kInstByteRange instruction (at ip->out()) and waits for
// the other threads to finish processing the input byte.
// Then, once all the threads have processed that input byte,
// the whole process repeats. The kInstAlt state instruction
// might create new threads during input processing, but no
// matter what, all the threads stop after a kInstByteRange
// and wait for the other threads to "catch up".
// Running in lock step like this ensures that the NFA reads
// the input string only once.
//
// Each thread maintains its own set of capture registers
// (the string positions at which it executed the kInstCapture
// instructions corresponding to capturing parentheses in the
// regular expression). Repeated copying of the capture registers
// is the main performance bottleneck in the NFA implementation.
//
// A regular expression program is "one-pass" if, no matter what
// the input string, there is only one thread that makes it
// past a kInstByteRange instruction at each input byte. This means
// that there is in some sense only one active thread throughout
// the execution. Other threads might be created during the
// processing of an input byte, but they are ephemeral: only one
// thread is left to start processing the next input byte.
// This is what I meant above when I said the non-determinism
// was "contained".
//
// To execute a one-pass regular expression program, we can build
// a DFA (no non-determinism) that has at most as many states as
// the NFA (compare this to the possibly exponential number of states
// in the general case). Each state records, for each possible
// input byte, the next state along with the conditions required
// before entering that state -- empty-width flags that must be true
// and capture operations that must be performed. It also records
// the set of conditions required to finish a match at that
// point in the input rather than process the next byte.

// A state in the one-pass NFA - just an array of actions indexed
// by the bytemap_[] of the next input byte. (The bytemap
// maps next input bytes into equivalence classes, to reduce
// the memory footprint.)
struct OneState {
  uint32 matchcond;   // conditions to match right now.
#ifdef _MSC_VER
#pragma warning(push)
#pragma warning(disable: 4200)
#endif
  uint32 action[];
#ifdef _MSC_VER
#pragma warning(pop)
#endif
};

// The uint32 conditions in the action are a combination of
// condition and capture bits and the next state. The bottom 16 bits
// are the condition and capture bits, and the top 16 are the index of
// the next state.
//
// Bits 0-5 are the empty-width flags from prog.h.
// Bit 6 is kMatchWins, which means the match takes
// priority over moving to next in a first-match search.
// The remaining bits mark capture registers that should
// be set to the current input position. The capture bits
// start at index 2, since the search loop can take care of
// cap[0], cap[1] (the overall match position).
// That means we can handle up to 5 capturing parens: $1 through $4, plus $0.
// No input position can satisfy both kEmptyWordBoundary
// and kEmptyNonWordBoundary, so we can use that as a sentinel
// instead of needing an extra bit.

static const int kIndexShift = 16; // number of bits below index
static const int kEmptyShift = 6; // number of empty flags in prog.h
static const int kRealCapShift = kEmptyShift + 1;
static const int kRealMaxCap = (kIndexShift - kRealCapShift) / 2 * 2;

// Parameters used to skip over cap[0], cap[1].
static const int kCapShift = kRealCapShift - 2;
static const int kMaxCap = kRealMaxCap + 2;

static const uint32 kMatchWins = 1 << kEmptyShift;
static const uint32 kCapMask = ((1 << kRealMaxCap) - 1) << kRealCapShift;

static const uint32 kImpossible = kEmptyWordBoundary | kEmptyNonWordBoundary;

// Check, at compile time, that prog.h agrees with math above.
// This function is never called.
void OnePass_Checks() {
  COMPILE_ASSERT((1<<kEmptyShift)-1 == kEmptyAllFlags,
                 kEmptyShift_disagrees_with_kEmptyAllFlags);
  // kMaxCap counts pointers, kMaxOnePassCapture counts pairs.
  COMPILE_ASSERT(kMaxCap == Prog::kMaxOnePassCapture*2,
                 kMaxCap_disagrees_with_kMaxOnePassCapture);
}

static bool Satisfy(uint32 cond, const StringPiece& context, const char* p) {
  uint32 satisfied = Prog::EmptyFlags(context, p);
  if (cond & kEmptyAllFlags & ~satisfied)
    return false;
  return true;
}

// Apply the capture bits in cond, saving p to the appropriate
// locations in cap[].
static void ApplyCaptures(uint32 cond, const char* p,
                          const char** cap, int ncap) {
  for (int i = 2; i < ncap; i++)
    if (cond & (1 << kCapShift << i))
      cap[i] = p;
}

// Compute a node pointer.
// Basically (OneState*)(nodes + statesize*nodeindex)
// but the version with the C++ casts overflows 80 characters (and is ugly).
static inline OneState* IndexToNode(volatile uint8* nodes, int statesize,
                                    int nodeindex) {
  return reinterpret_cast<OneState*>(
      const_cast<uint8*>(nodes + statesize*nodeindex));
}

bool Prog::SearchOnePass(const StringPiece& text,
                         const StringPiece& const_context,
                         Anchor anchor, MatchKind kind,
                         StringPiece* match, int nmatch) {
  if (anchor != kAnchored && kind != kFullMatch) {
    LOG(DFATAL) << "Cannot use SearchOnePass for unanchored matches.";
    return false;
  }

  // Make sure we have at least cap[1],
  // because we use it to tell if we matched.
  int ncap = 2*nmatch;
  if (ncap < 2)
    ncap = 2;

  const char* cap[kMaxCap];
  for (int i = 0; i < ncap; i++)
    cap[i] = NULL;

  const char* matchcap[kMaxCap];
  for (int i = 0; i < ncap; i++)
    matchcap[i] = NULL;

  StringPiece context = const_context;
  if (context.begin() == NULL)
    context = text;
  if (anchor_start() && context.begin() != text.begin())
    return false;
  if (anchor_end() && context.end() != text.end())
    return false;
  if (anchor_end())
    kind = kFullMatch;

  // State and act are marked volatile to
  // keep the compiler from re-ordering the
  // memory accesses walking over the NFA.
  // This is worth about 5%.
  volatile OneState* state = onepass_start_;
  volatile uint8* nodes = onepass_nodes_;
  volatile uint32 statesize = onepass_statesize_;
  uint8* bytemap = bytemap_;
  const char* bp = text.begin();
  const char* ep = text.end();
  const char* p;
  bool matched = false;
  matchcap[0] = bp;
  cap[0] = bp;
  uint32 nextmatchcond = state->matchcond;
  for (p = bp; p < ep; p++) {
    int c = bytemap[*p & 0xFF];
    uint32 matchcond = nextmatchcond;
    uint32 cond = state->action[c];

    // Determine whether we can reach act->next.
    // If so, advance state and nextmatchcond.
    if ((cond & kEmptyAllFlags) == 0 || Satisfy(cond, context, p)) {
      uint32 nextindex = cond >> kIndexShift;
      state = IndexToNode(nodes, statesize, nextindex);
      nextmatchcond = state->matchcond;
    } else {
      state = NULL;
      nextmatchcond = kImpossible;
    }

    // This code section is carefully tuned.
    // The goto sequence is about 10% faster than the
    // obvious rewrite as a large if statement in the
    // ASCIIMatchRE2 and DotMatchRE2 benchmarks.

    // Saving the match capture registers is expensive.
    // Is this intermediate match worth thinking about?

    // Not if we want a full match.
    if (kind == kFullMatch)
      goto skipmatch;

    // Not if it's impossible.
    if (matchcond == kImpossible)
      goto skipmatch;

    // Not if the possible match is beaten by the certain
    // match at the next byte.  When this test is useless
    // (e.g., HTTPPartialMatchRE2) it slows the loop by
    // about 10%, but when it avoids work (e.g., DotMatchRE2),
    // it cuts the loop execution by about 45%.
    if ((cond & kMatchWins) == 0 && (nextmatchcond & kEmptyAllFlags) == 0)
      goto skipmatch;

    // Finally, the match conditions must be satisfied.
    if ((matchcond & kEmptyAllFlags) == 0 || Satisfy(matchcond, context, p)) {
      for (int i = 2; i < 2*nmatch; i++)
        matchcap[i] = cap[i];
      if (nmatch > 1 && (matchcond & kCapMask))
        ApplyCaptures(matchcond, p, matchcap, ncap);
      matchcap[1] = p;
      matched = true;

      // If we're in longest match mode, we have to keep
      // going and see if we find a longer match.
      // In first match mode, we can stop if the match
      // takes priority over the next state for this input byte.
      // That bit is per-input byte and thus in cond, not matchcond.
      if (kind == kFirstMatch && (cond & kMatchWins))
        goto done;
    }

  skipmatch:
    if (state == NULL)
      goto done;
    if ((cond & kCapMask) && nmatch > 1)
      ApplyCaptures(cond, p, cap, ncap);
  }

  // Look for match at end of input.
  {
    uint32 matchcond = state->matchcond;
    if (matchcond != kImpossible &&
        ((matchcond & kEmptyAllFlags) == 0 || Satisfy(matchcond, context, p))) {
      if (nmatch > 1 && (matchcond & kCapMask))
        ApplyCaptures(matchcond, p, cap, ncap);
      for (int i = 2; i < ncap; i++)
        matchcap[i] = cap[i];
      matchcap[1] = p;
      matched = true;
    }
  }

done:
  if (!matched)
    return false;
  for (int i = 0; i < nmatch; i++)
    match[i].set(matchcap[2*i],
                 static_cast<int>(matchcap[2*i+1] - matchcap[2*i]));
  return true;
}

// Analysis to determine whether a given regexp program is one-pass.

// If ip is not on workq, adds ip to work queue and returns true.
// If ip is already on work queue, does nothing and returns false.
// If ip is NULL, does nothing and returns true (pretends to add it).
typedef SparseSet Instq;
static bool AddQ(Instq *q, int id) {
  if (id == 0)
    return true;
  if (q->contains(id))
    return false;
  q->insert(id);
  return true;
}

struct InstCond {
  int id;
  uint32 cond;
};

// Returns whether this is a one-pass program; that is,
// returns whether it is safe to use SearchOnePass on this program.
// These conditions must be true for any instruction ip:
//
// (1) for any other Inst nip, there is at most one input-free
// path from ip to nip.
// (2) there is at most one kInstByte instruction reachable from
// ip that matches any particular byte c.
// (3) there is at most one input-free path from ip to a kInstMatch
// instruction.
//
// This is actually just a conservative approximation: it might
// return false when the answer is true, when kInstEmptyWidth
// instructions are involved.
// Constructs and saves corresponding one-pass NFA on success.
bool Prog::IsOnePass() {
  if (did_onepass_)
    return onepass_start_ != NULL;
  did_onepass_ = true;

  if (start() == 0)  // no match
    return false;

  // Steal memory for the one-pass NFA from the overall DFA budget.
  // Willing to use at most 1/4 of the DFA budget (heuristic).
  // Limit max node count to 65000 as a conservative estimate to
  // avoid overflowing 16-bit node index in encoding.
  int maxnodes = 2 + inst_count(kInstByteRange);
  int statesize = sizeof(OneState) + bytemap_range_*sizeof(uint32);
  if (maxnodes >= 65000 || dfa_mem_ / 4 / statesize < maxnodes)
    return false;

  // Flood the graph starting at the start state, and check
  // that in each reachable state, each possible byte leads
  // to a unique next state.
  int stacksize = inst_count(kInstCapture) +
                  inst_count(kInstEmptyWidth) +
                  inst_count(kInstNop) + 1;  // + 1 for start inst
  InstCond* stack = new InstCond[stacksize];

  int size = this->size();
  int* nodebyid = new int[size];  // indexed by ip
  memset(nodebyid, 0xFF, size*sizeof nodebyid[0]);

  uint8* nodes = new uint8[maxnodes*statesize];
  uint8* nodep = nodes;

  Instq tovisit(size), workq(size);
  AddQ(&tovisit, start());
  nodebyid[start()] = 0;
  nodep += statesize;
  int nalloc = 1;
  for (Instq::iterator it = tovisit.begin(); it != tovisit.end(); ++it) {
    int id = *it;
    int nodeindex = nodebyid[id];
    OneState* node = IndexToNode(nodes, statesize, nodeindex);

    // Flood graph using manual stack, filling in actions as found.
    // Default is none.
    for (int b = 0; b < bytemap_range_; b++)
      node->action[b] = kImpossible;
    node->matchcond = kImpossible;

    workq.clear();
    bool matched = false;
    int nstack = 0;
    stack[nstack].id = id;
    stack[nstack++].cond = 0;
    while (nstack > 0) {
      int id = stack[--nstack].id;
      uint32 cond = stack[nstack].cond;

Loop:
  Prog::Inst* ip = inst(id);
  switch (ip->opcode()) {
    default:
      LOG(DFATAL) << "unhandled opcode: " << ip->opcode();
      break;

    case kInstAltMatch:
      // TODO(rsc): Ignoring kInstAltMatch optimization.
      // Should implement it in this engine, but it's subtle.
      DCHECK(!ip->last());
      // If already on work queue, (1) is violated: bail out.
      if (!AddQ(&workq, id+1))
        goto fail;
      id = id+1;
      goto Loop;

    case kInstByteRange: {
      int nextindex = nodebyid[ip->out()];
      if (nextindex == -1) {
        if (nalloc >= maxnodes) {
          if (Debug)
            LOG(ERROR) << StringPrintf(
                "Not OnePass: hit node limit %d > %d", nalloc, maxnodes);
          goto fail;
        }
        nextindex = nalloc;
        nodep += statesize;
        nodebyid[ip->out()] = nextindex;
        nalloc++;
        AddQ(&tovisit, ip->out());
      }
      for (int c = ip->lo(); c <= ip->hi(); c++) {
        int b = bytemap_[c];
        // Skip any bytes immediately after c that are also in b.
        while (c < 256-1 && bytemap_[c+1] == b)
          c++;
        uint32 act = node->action[b];
        uint32 newact = (nextindex << kIndexShift) | cond;
        if (matched)
          newact |= kMatchWins;
        if ((act & kImpossible) == kImpossible) {
          node->action[b] = newact;
        } else if (act != newact) {
          if (Debug) {
            LOG(ERROR) << StringPrintf(
                "Not OnePass: conflict on byte %#x at state %d", c, *it);
          }
          goto fail;
        }
      }
      if (ip->foldcase()) {
        Rune lo = max<Rune>(ip->lo(), 'a') + 'A' - 'a';
        Rune hi = min<Rune>(ip->hi(), 'z') + 'A' - 'a';
        for (int c = lo; c <= hi; c++) {
          int b = bytemap_[c];
          // Skip any bytes immediately after c that are also in b.
          while (c < 256-1 && bytemap_[c+1] == b)
            c++;
          uint32 act = node->action[b];
          uint32 newact = (nextindex << kIndexShift) | cond;
          if (matched)
            newact |= kMatchWins;
          if ((act & kImpossible) == kImpossible) {
            node->action[b] = newact;
          } else if (act != newact) {
            if (Debug) {
              LOG(ERROR) << StringPrintf(
                  "Not OnePass: conflict on byte %#x at state %d", c, *it);
            }
            goto fail;
          }
        }
      }

      if (ip->last())
        break;
      // If already on work queue, (1) is violated: bail out.
      if (!AddQ(&workq, id+1))
        goto fail;
      id = id+1;
      goto Loop;
    }

    case kInstCapture:
    case kInstEmptyWidth:
    case kInstNop:
      if (!ip->last()) {
        // If already on work queue, (1) is violated: bail out.
        if (!AddQ(&workq, id+1))
          goto fail;
        stack[nstack].id = id+1;
        stack[nstack++].cond = cond;
      }

      if (ip->opcode() == kInstCapture && ip->cap() < kMaxCap)
        cond |= (1 << kCapShift) << ip->cap();
      if (ip->opcode() == kInstEmptyWidth)
        cond |= ip->empty();

      // kInstCapture and kInstNop always proceed to ip->out().
      // kInstEmptyWidth only sometimes proceeds to ip->out(),
      // but as a conservative approximation we assume it always does.
      // We could be a little more precise by looking at what c
      // is, but that seems like overkill.

      // If already on work queue, (1) is violated: bail out.
      if (!AddQ(&workq, ip->out())) {
        if (Debug) {
          LOG(ERROR) << StringPrintf(
              "Not OnePass: multiple paths %d -> %d\n", *it, ip->out());
        }
        goto fail;
      }
      id = ip->out();
      goto Loop;

    case kInstMatch:
      if (matched) {
        // (3) is violated
        if (Debug) {
          LOG(ERROR) << StringPrintf(
              "Not OnePass: multiple matches from %d\n", *it);
        }
        goto fail;
      }
      matched = true;
      node->matchcond = cond;

      if (ip->last())
        break;
      // If already on work queue, (1) is violated: bail out.
      if (!AddQ(&workq, id+1))
        goto fail;
      id = id+1;
      goto Loop;

    case kInstFail:
      break;
  }
}

}

  if (Debug) {  // For debugging, dump one-pass NFA to LOG(ERROR).
    LOG(ERROR) << "bytemap:\n" << DumpByteMap();
    LOG(ERROR) << "prog:\n" << Dump();

    map<int, int> idmap;
    for (int i = 0; i < size; i++)
      if (nodebyid[i] != -1)
        idmap[nodebyid[i]] = i;

    string dump;
    for (Instq::iterator it = tovisit.begin(); it != tovisit.end(); ++it) {
      int id = *it;
      int nodeindex = nodebyid[id];
      if (nodeindex == -1)
        continue;
      OneState* node = IndexToNode(nodes, statesize, nodeindex);
      StringAppendF(&dump, "node %d id=%d: matchcond=%#x\n",
                    nodeindex, id, node->matchcond);
      for (int i = 0; i < bytemap_range_; i++) {
        if ((node->action[i] & kImpossible) == kImpossible)
          continue;
        StringAppendF(&dump, "  %d cond %#x -> %d id=%d\n",
                      i, node->action[i] & 0xFFFF,
                      node->action[i] >> kIndexShift,
                      idmap[node->action[i] >> kIndexShift]);
      }
    }
    LOG(ERROR) << "nodes:\n" << dump;
  }

  // Overallocated earlier; cut down to actual size.
  nodep = new uint8[nalloc*statesize];
  memmove(nodep, nodes, nalloc*statesize);
  delete[] nodes;
  nodes = nodep;

  onepass_start_ = IndexToNode(nodes, statesize, nodebyid[start()]);
  onepass_nodes_ = nodes;
  onepass_statesize_ = statesize;
  dfa_mem_ -= nalloc*statesize;

  delete[] stack;
  delete[] nodebyid;
  return true;

fail:
  delete[] stack;
  delete[] nodebyid;
  delete[] nodes;
  return false;
}

} // namespace re2

// Copyright 2008 The RE2 Authors. All Rights Reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// A DFA (deterministic finite automaton)-based regular expression search.
//
// The DFA search has two main parts: the construction of the automaton,
// which is represented by a graph of State structures, and the execution
// of the automaton over a given input string.
//
// The basic idea is that the State graph is constructed so that the
// execution can simply start with a state s, and then for each byte c in
// the input string, execute "s = s->next[c]", checking at each point whether
// the current s represents a matching state.
//
// The simple explanation just given does convey the essence of this code,
// but it omits the details of how the State graph gets constructed as well
// as some performance-driven optimizations to the execution of the automaton.
// All these details are explained in the comments for the code following
// the definition of class DFA.
//
// See http://swtch.com/~rsc/regexp/ for a very bare-bones equivalent.

#include "util/flags.h"
#include "util/sparse_set.h"
#include "re2/prog.h"
#include "re2/stringpiece.h"

DEFINE_bool(re2_dfa_bail_when_slow, true,
            "Whether the RE2 DFA should bail out early "
            "if the NFA would be faster (for testing).");

namespace re2 {

#if !defined(linux)  /* only Linux seems to have memrchr */
static void* memrchr(const void* s, int c, size_t n) {
  const unsigned char* p = (const unsigned char*)s;
  for (p += n; n > 0; n--)
    if (*--p == c)
      return (void*)p;

  return NULL;
}
#endif

// Changing this to true compiles in prints that trace execution of the DFA.
// Generates a lot of output -- only useful for debugging.
static const bool DebugDFA = false;

// A DFA implementation of a regular expression program.
// Since this is entirely a forward declaration mandated by C++,
// some of the comments here are better understood after reading
// the comments in the sections that follow the DFA definition.
class DFA {
 public:
  DFA(Prog* prog, Prog::MatchKind kind, int64 max_mem);
  ~DFA();
  bool ok() const { return !init_failed_; }
  Prog::MatchKind kind() { return kind_; }

  // Searches for the regular expression in text, which is considered
  // as a subsection of context for the purposes of interpreting flags
  // like ^ and $ and \A and \z.
  // Returns whether a match was found.
  // If a match is found, sets *ep to the end point of the best match in text.
  // If "anchored", the match must begin at the start of text.
  // If "want_earliest_match", the match that ends first is used, not
  // necessarily the best one.
  // If "run_forward" is true, the DFA runs from text.begin() to text.end().
  // If it is false, the DFA runs from text.end() to text.begin(),
  // returning the leftmost end of the match instead of the rightmost one.
  // If the DFA cannot complete the search (for example, if it is out of
  // memory), it sets *failed and returns false.
  bool Search(const StringPiece& text, const StringPiece& context,
              bool anchored, bool want_earliest_match, bool run_forward,
              bool* failed, const char** ep, vector<int>* matches);

  // Builds out all states for the entire DFA. FOR TESTING ONLY
  // Returns number of states.
  int BuildAllStates();

  // Computes min and max for matching strings. Won't return strings
  // bigger than maxlen.
  bool PossibleMatchRange(string* min, string* max, int maxlen);

  // These data structures are logically private, but C++ makes it too
  // difficult to mark them as such.
  class Workq;
  class RWLocker;
  class StateSaver;

  // A single DFA state. The DFA is represented as a graph of these
  // States, linked by the next_ pointers. If in state s and reading
  // byte c, the next state should be s->next_[c].
  struct State {
    inline bool IsMatch() const { return (flag_ & kFlagMatch) != 0; }
    void SaveMatch(vector<int>* v);

    int* inst_;   // Instruction pointers in the state.
    int ninst_;   // # of inst_ pointers.
    uint flag_;   // Empty string bitfield flags in effect on the way
                  // into this state, along with kFlagMatch if this
                  // is a matching state.

#ifdef _MSC_VER
#pragma warning(push)
#pragma warning(disable: 4200)  // nonstandard extension: zero-sized array
#endif
    std::atomic<State*> next_[];  // Outgoing arrows from State,
                                  // one per input byte class
#ifdef _MSC_VER
#pragma warning(pop)
#endif
  };

  enum {
    kByteEndText = 256,         // imaginary byte at end of text

    kFlagEmptyMask = 0xFFF,     // State.flag_: bits holding kEmptyXXX flags
    kFlagMatch = 0x1000,        // State.flag_: this is a matching state
    kFlagLastWord = 0x2000,     // State.flag_: last byte was a word char
    kFlagNeedShift = 16,        // needed kEmpty bits are or'ed in shifted left
  };

#ifndef STL_MSVC
  // STL function structures for use with unordered_set.
  struct StateEqual {
    bool operator()(const State* a, const State* b) const {
      if (a == b)
        return true;
      if (a == NULL || b == NULL)
        return false;
      if (a->ninst_ != b->ninst_)
        return false;
      if (a->flag_ != b->flag_)
        return false;
      for (int i = 0; i < a->ninst_; i++)
        if (a->inst_[i] != b->inst_[i])
          return false;
      return true;  // they're equal
    }
  };
#endif  // STL_MSVC
  struct StateHash {
    size_t operator()(const State* a) const {
      if (a == NULL)
        return 0;
      const char* s = reinterpret_cast<const char*>(a->inst_);
      int len = a->ninst_ * sizeof a->inst_[0];
      if (sizeof(size_t) == sizeof(uint32))
        return Hash32StringWithSeed(s, len, a->flag_);
      else
        return static_cast<size_t>(Hash64StringWithSeed(s, len, a->flag_));
    }
#ifdef STL_MSVC
    // Less than operator.
    bool operator()(const State* a, const State* b) const {
      if (a == b)
        return false;
      if (a == NULL || b == NULL)
        return a == NULL;
      if (a->ninst_ != b->ninst_)
        return a->ninst_ < b->ninst_;
      if (a->flag_ != b->flag_)
        return a->flag_ < b->flag_;
      for (int i = 0; i < a->ninst_; ++i)
        if (a->inst_[i] != b->inst_[i])
          return a->inst_[i] < b->inst_[i];
      return false;  // they're equal
    }
    // The two public members are required by msvc. 4 and 8 are default values.
    // Reference: http://msdn.microsoft.com/en-us/library/1s1byw77.aspx
    static const size_t bucket_size = 4;
    static const size_t min_buckets = 8;
#endif  // STL_MSVC
  };

#ifdef STL_MSVC
  typedef unordered_set<State*, StateHash> StateSet;
#else  // !STL_MSVC
  typedef unordered_set<State*, StateHash, StateEqual> StateSet;
#endif  // STL_MSVC

 private:
  // Special "firstbyte" values for a state. (Values >= 0 denote actual bytes.)
  enum {
    kFbUnknown = -1,  // No analysis has been performed.
    kFbMany = -2,     // Many bytes will lead out of this state.
    kFbNone = -3,     // No bytes lead out of this state.
  };

  enum {
    // Indices into start_ for unanchored searches.
    // Add kStartAnchored for anchored searches.
    kStartBeginText = 0,         // text at beginning of context
    kStartBeginLine = 2,         // text at beginning of line
    kStartAfterWordChar = 4,     // text follows a word character
    kStartAfterNonWordChar = 6,  // text follows non-word character
    kMaxStart = 8,

    kStartAnchored = 1,
  };

// Resets the DFA State cache, flushing all saved State* information.
// Releases and reacquires cache_mutex_ via cache_lock, so any
// State* existing before the call are not valid after the call.
// Use a StateSaver to preserve important states across the call.
// cache_mutex_.r <= L < mutex_
// After: cache_mutex_.w <= L < mutex_
void ResetCache(RWLocker* cache_lock);

// Looks up and returns the State corresponding to a Workq.
// L >= mutex_
State* WorkqToCachedState(Workq* q, uint flag);

// Looks up and returns a State matching the inst, ninst, and flag.
// L >= mutex_
State* CachedState(int* inst, int ninst, uint flag);

// Clear the cache entirely.
// Must hold cache_mutex_.w or be in destructor.
void ClearCache();

// Converts a State into a Workq: the opposite of WorkqToCachedState.
// L >= mutex_
void StateToWorkq(State* s, Workq* q);

// Runs a State on a given byte, returning the next state.
State* RunStateOnByteUnlocked(State*, int); // cache_mutex_.r <= L < mutex_
State* RunStateOnByte(State*, int); // L >= mutex_

// Runs a Workq on a given byte followed by a set of empty-string flags,
// producing a new Workq in nq. If a match instruction is encountered,
// sets ismatch to true.
// L >= mutex_
void RunWorkqOnByte(Workq* q, Workq* nq,
int c, uint flag, bool* ismatch,
Prog::MatchKind kind);

// Runs a Workq on a set of empty-string flags, producing a new Workq in nq.
// L >= mutex_
void RunWorkqOnEmptyString(Workq* q, Workq* nq, uint flag);

// Adds the instruction id to the Workq, following empty arrows
// according to flag.
// L >= mutex_
void AddToQueue(Workq* q, int id, uint flag);

// For debugging, returns a text representation of State.
static string DumpState(State* state);

// For debugging, returns a text representation of a Workq.
static string DumpWorkq(Workq* q);

// Search parameters
struct SearchParams {
SearchParams(const StringPiece& text, const StringPiece& context,
RWLocker* cache_lock)
: text(text), context(context),
anchored(false),
want_earliest_match(false),
run_forward(false),
start(NULL),
firstbyte(kFbUnknown),
cache_lock(cache_lock),
failed(false),
ep(NULL),
matches(NULL) { }

StringPiece text;
StringPiece context;
bool anchored;
bool want_earliest_match;
bool run_forward;
State* start;
int firstbyte;
RWLocker *cache_lock;
bool failed;     // "out" parameter: whether search gave up
const char* ep;  // "out" parameter: end pointer for match
vector<int>* matches;

private:
DISALLOW_COPY_AND_ASSIGN(SearchParams);
};

// Before each search, the parameters to Search are analyzed by
// AnalyzeSearch to determine the state in which to start and the
// "firstbyte" for that state, if any.
struct StartInfo {
StartInfo() : start(NULL), firstbyte(kFbUnknown) { }
State* start;
std::atomic<int64> firstbyte;
};

// Fills in params->start and params->firstbyte using
// the other search parameters. Returns true on success,
// false on failure.
// cache_mutex_.r <= L < mutex_
bool AnalyzeSearch(SearchParams* params);
bool AnalyzeSearchHelper(SearchParams* params, StartInfo* info, uint flags);

// The generic search loop, inlined to create specialized versions.
// cache_mutex_.r <= L < mutex_
// Might unlock and relock cache_mutex_ via params->cache_lock.
inline bool InlinedSearchLoop(SearchParams* params,
bool have_firstbyte,
bool want_earliest_match,
bool run_forward);

// The specialized versions of InlinedSearchLoop. The three letters
// at the ends of the name denote the true/false values used as the
// last three parameters of InlinedSearchLoop.
// cache_mutex_.r <= L < mutex_
// Might unlock and relock cache_mutex_ via params->cache_lock.
bool SearchFFF(SearchParams* params);
bool SearchFFT(SearchParams* params);
bool SearchFTF(SearchParams* params);
bool SearchFTT(SearchParams* params);
bool SearchTFF(SearchParams* params);
bool SearchTFT(SearchParams* params);
bool SearchTTF(SearchParams* params);
bool SearchTTT(SearchParams* params);

// The main search loop: calls an appropriate specialized version of
// InlinedSearchLoop.
// cache_mutex_.r <= L < mutex_
// Might unlock and relock cache_mutex_ via params->cache_lock.
bool FastSearchLoop(SearchParams* params);

// For debugging, a slow search loop that calls InlinedSearchLoop
// directly -- because the booleans passed are not constants, the
// loop is not specialized like the SearchFFF etc. versions, so it
// runs much more slowly. Useful only for debugging.
// cache_mutex_.r <= L < mutex_
// Might unlock and relock cache_mutex_ via params->cache_lock.
bool SlowSearchLoop(SearchParams* params);

// Looks up bytes in bytemap_ but handles case c == kByteEndText too.
int ByteMap(int c) {
if (c == kByteEndText)
return prog_->bytemap_range();
return prog_->bytemap()[c];
}

// Constant after initialization.
Prog* prog_; // The regular expression program to run.
Prog::MatchKind kind_; // The kind of DFA.
bool init_failed_; // initialization failed (out of memory)

Mutex mutex_; // mutex_ >= cache_mutex_.r

// Scratch areas, protected by mutex_.
Workq* q0_; // Two pre-allocated work queues.
Workq* q1_;
int* astack_; // Pre-allocated stack for AddToQueue
int nastack_;

// State* cache. Many threads use and add to the cache simultaneously,
// holding cache_mutex_ for reading and mutex_ (above) when adding.
// If the cache fills and needs to be discarded, the discarding is done
// while holding cache_mutex_ for writing, to avoid interrupting other
// readers. Any State* pointers are only valid while cache_mutex_
// is held.
Mutex cache_mutex_;
int64 mem_budget_; // Total memory budget for all States.
int64 state_budget_; // Amount of memory remaining for new States.
StateSet state_cache_; // All States computed so far.
StartInfo start_[kMaxStart];
bool cache_warned_; // have printed to LOG(INFO) about the cache
};

// Shorthand for casting to uint8*.
static inline const uint8* BytePtr(const void* v) {
return reinterpret_cast<const uint8*>(v);
}

// Work queues

// Marks separate thread groups of different priority
// in the work queue when in leftmost-longest matching mode.
#define Mark (-1)

// Internally, the DFA uses a sparse array of
// program instruction pointers as a work queue.
// In leftmost longest mode, marks separate sections
// of workq that started executing at different
// locations in the string (earlier locations first).
class DFA::Workq : public SparseSet {
public:
// Constructor: n is number of normal slots, maxmark number of mark slots.
Workq(int n, int maxmark) :
SparseSet(n+maxmark),
n_(n),
maxmark_(maxmark),
nextmark_(n),
last_was_mark_(true) {
}

bool is_mark(int i) { return i >= n_; }

int maxmark() { return maxmark_; }

void clear() {
SparseSet::clear();
nextmark_ = n_;
}

void mark() {
if (last_was_mark_)
return;
last_was_mark_ = false;
SparseSet::insert_new(nextmark_++);
}

int size() {
return n_ + maxmark_;
}

void insert(int id) {
if (contains(id))
return;
insert_new(id);
}

void insert_new(int id) {
last_was_mark_ = false;
SparseSet::insert_new(id);
}

private:
int n_; // size excluding marks
int maxmark_; // maximum number of marks
int nextmark_; // id of next mark
bool last_was_mark_; // last inserted was mark
DISALLOW_COPY_AND_ASSIGN(Workq);
};

DFA::DFA(Prog* prog, Prog::MatchKind kind, int64 max_mem)
: prog_(prog),
kind_(kind),
init_failed_(false),
q0_(NULL),
q1_(NULL),
astack_(NULL),
mem_budget_(max_mem),
cache_warned_(false) {
if (DebugDFA)
fprintf(stderr, "\nkind %d\n%s\n", (int)kind_, prog_->DumpUnanchored().c_str());
int nmark = 0;
if (kind_ == Prog::kLongestMatch)
nmark = prog_->size();
// See DFA::AddToQueue() for why this is so.
nastack_ = prog_->inst_count(kInstCapture) +
prog_->inst_count(kInstEmptyWidth) +
prog_->inst_count(kInstNop) +
nmark + 1; // + 1 for start inst

// Account for space needed for DFA, q0, q1, astack.
mem_budget_ -= sizeof(DFA);
mem_budget_ -= (prog_->size() + nmark) *
(sizeof(int)+sizeof(int)) * 2; // q0, q1
mem_budget_ -= nastack_ * sizeof(int); // astack
if (mem_budget_ < 0) {
LOG(INFO) << StringPrintf("DFA out of memory: prog size %d mem %lld",
prog_->size(), max_mem);
init_failed_ = true;
return;
}

state_budget_ = mem_budget_;

// Make sure there is a reasonable amount of working room left.
// At minimum, the search requires room for two states in order
// to limp along, restarting frequently. We'll get better performance
// if there is room for a larger number of states, say 20.
// Note that a state stores list heads only, so we use the program
// list count for the upper bound, not the program size.
int nnext = prog_->bytemap_range() + 1; // + 1 for kByteEndText slot
int64 one_state = sizeof(State) + nnext*sizeof(std::atomic<State*>) +
(prog_->list_count()+nmark)*sizeof(int);
if (state_budget_ < 20*one_state) {
LOG(INFO) << StringPrintf("DFA out of memory: prog size %d mem %lld",
prog_->size(), max_mem);
init_failed_ = true;
return;
}

q0_ = new Workq(prog_->size(), nmark);
q1_ = new Workq(prog_->size(), nmark);
astack_ = new int[nastack_];
}

DFA::~DFA() {
delete q0_;
delete q1_;
delete[] astack_;
ClearCache();
}

// In the DFA state graph, s->next[c] == NULL means that the
// state has not yet been computed and needs to be. We need
// a different special value to signal that s->next[c] is a
// state that can never lead to a match (and thus the search
// can be called off). Hence DeadState.
#define DeadState reinterpret_cast<State*>(1)

// Signals that the rest of the string matches no matter what it is.
#define FullMatchState reinterpret_cast<State*>(2)

#define SpecialStateMax FullMatchState

// Debugging printouts

// For debugging, returns a string representation of the work queue.
string DFA::DumpWorkq(Workq* q) {
string s;
const char* sep = "";
for (DFA::Workq::iterator it = q->begin(); it != q->end(); ++it) {
if (q->is_mark(*it)) {
StringAppendF(&s, "|");
sep = "";
} else {
StringAppendF(&s, "%s%d", sep, *it);
sep = ",";
}
}
return s;
}

// For debugging, returns a string representation of the state.
string DFA::DumpState(State* state) {
if (state == NULL)
return "";
if (state == DeadState)
return "X";
if (state == FullMatchState)
return "*";
string s;
const char* sep = "";
StringAppendF(&s, "(%p)", state);
for (int i = 0; i < state->ninst_; i++) {
if (state->inst_[i] == Mark) {
StringAppendF(&s, "|");
sep = "";
} else {
StringAppendF(&s, "%s%d", sep, state->inst_[i]);
sep = ",";
}
}
StringAppendF(&s, " flag=%#x", state->flag_);
return s;
}

//////////////////////////////////////////////////////////////////////
//
// DFA state graph construction.
//
// The DFA state graph is a heavily-linked collection of State* structures.
// The state_cache_ is a set of all the State structures ever allocated,
// so that if the same state is reached by two different paths,
// the same State structure can be used. This reduces allocation
// requirements and also avoids duplication of effort across the two
// identical states.
//
// A State is defined by an ordered list of instruction ids and a flag word.
//
// The choice of an ordered list of instructions differs from a typical
// textbook DFA implementation, which would use an unordered set.
// Textbook descriptions, however, only care about whether
// the DFA matches, not where it matches in the text. To decide where the
// DFA matches, we need to mimic the behavior of the dominant backtracking
// implementations like PCRE, which try one possible regular expression
// execution, then another, then another, stopping when one of them succeeds.
// The DFA execution tries these many executions in parallel, representing
// each by an instruction id. These pointers are ordered in the State.inst_
// list in the same order that the executions would happen in a backtracking
// search: if a match is found during execution of inst_[2], inst_[i] for i>=3
// can be discarded.
//
// Textbooks also typically do not consider context-aware empty string operators
// like ^ or $. These are handled by the flag word, which specifies the set
// of empty-string operators that should be matched when executing at the
// current text position. These flag bits are defined in prog.h.
// The flag word also contains two DFA-specific bits: kFlagMatch if the state
// is a matching state (one that reached a kInstMatch in the program)
// and kFlagLastWord if the last processed byte was a word character, for the
// implementation of \B and \b.
//
// The flag word also contains, shifted up 16 bits, the bits looked for by
// any kInstEmptyWidth instructions in the state. These provide a useful
// summary indicating when new flags might be useful.
//
// The permanent representation of a State's instruction ids is just an array,
// but while a state is being analyzed, these instruction ids are represented
// as a Workq, which is an array that allows iteration in insertion order.

// NOTE(rsc): The choice of State construction determines whether the DFA
// mimics backtracking implementations (so-called leftmost first matching) or
// traditional DFA implementations (so-called leftmost longest matching as
// prescribed by POSIX). This implementation chooses to mimic the
// backtracking implementations, because we want to replace PCRE. To get
// POSIX behavior, the states would need to be considered not as a simple
// ordered list of instruction ids, but as a list of unordered sets of instruction
// ids. A match by a state in one set would inhibit the running of sets
// farther down the list but not other instruction ids in the same set. Each
// set would correspond to matches beginning at a given point in the string.
// This is implemented by separating different sets with Mark pointers.

// Looks in the State cache for a State matching q, flag.
// If one is found, returns it. If one is not found, allocates one,
// inserts it in the cache, and returns it.
DFA::State* DFA::WorkqToCachedState(Workq* q, uint flag) {
if (DEBUG_MODE)
mutex_.AssertHeld();

// Construct array of instruction ids for the new state.
// Only ByteRange, EmptyWidth, and Match instructions are useful to keep:
// those are the only operators with any effect in
// RunWorkqOnEmptyString or RunWorkqOnByte.
int* inst = new int[q->size()];
int n = 0;
uint needflags = 0; // flags needed by kInstEmptyWidth instructions
bool sawmatch = false; // whether queue contains guaranteed kInstMatch
bool sawmark = false; // whether queue contains a Mark
if (DebugDFA)
fprintf(stderr, "WorkqToCachedState %s [%#x]", DumpWorkq(q).c_str(), flag);
for (Workq::iterator it = q->begin(); it != q->end(); ++it) {
int id = *it;
if (sawmatch && (kind_ == Prog::kFirstMatch || q->is_mark(id)))
break;
if (q->is_mark(id)) {
if (n > 0 && inst[n-1] != Mark) {
sawmark = true;
inst[n++] = Mark;
}
continue;
}
Prog::Inst* ip = prog_->inst(id);
switch (ip->opcode()) {
case kInstAltMatch:
// This state will continue to a match no matter what
// the rest of the input is. If it is the highest priority match
// being considered, return the special FullMatchState
// to indicate that it's all matches from here out.
if (kind_ != Prog::kManyMatch &&
(kind_ != Prog::kFirstMatch ||
(it == q->begin() && ip->greedy(prog_))) &&
(kind_ != Prog::kLongestMatch || !sawmark) &&
(flag & kFlagMatch)) {
delete[] inst;
if (DebugDFA)
fprintf(stderr, " -> FullMatchState\n");
return FullMatchState;
}
// Fall through.
default:
// Record iff id is the head of its list, which must
// be the case if id-1 is the last of its list. :)
if (prog_->inst(id-1)->last())
inst[n++] = *it;
if (ip->opcode() == kInstEmptyWidth)
needflags |= ip->empty();
if (ip->opcode() == kInstMatch && !prog_->anchor_end())
sawmatch = true;
break;
}
}
DCHECK_LE(n, q->size());
if (n > 0 && inst[n-1] == Mark)
n--;

// If there are no empty-width instructions waiting to execute,
// then the extra flag bits will not be used, so there is no
// point in saving them. (Discarding them reduces the number
// of distinct states.)
if (needflags == 0)
flag &= kFlagMatch;

// NOTE(rsc): The code above cannot do flag &= needflags,
// because if the right flags were present to pass the current
// kInstEmptyWidth instructions, new kInstEmptyWidth instructions
// might be reached that in turn need different flags.
// The only sure thing is that if there are no kInstEmptyWidth
// instructions at all, no flags will be needed.
// We could do the extra work to figure out the full set of
// possibly needed flags by exploring past the kInstEmptyWidth
// instructions, but the check above -- are any flags needed
// at all? -- handles the most common case. More fine-grained
// analysis can only be justified by measurements showing that
// too many redundant states are being allocated.

// If there are no Insts in the list, it's a dead state,
// which is useful to signal with a special pointer so that
// the execution loop can stop early. This is only okay
// if the state is not a matching state.
if (n == 0 && flag == 0) {
delete[] inst;
if (DebugDFA)
fprintf(stderr, " -> DeadState\n");
return DeadState;
}

// If we're in longest match mode, the state is a sequence of
// unordered state sets separated by Marks. Sort each set
// to canonicalize, to reduce the number of distinct sets stored.
if (kind_ == Prog::kLongestMatch) {
int* ip = inst;
int* ep = ip + n;
while (ip < ep) {
int* markp = ip;
while (markp < ep && *markp != Mark)
markp++;
sort(ip, markp);
if (markp < ep)
markp++;
ip = markp;
}
}

// Save the needed empty-width flags in the top bits for use later.
flag |= needflags << kFlagNeedShift;

State* state = CachedState(inst, n, flag);
delete[] inst;
return state;
}

// Looks in the State cache for a State matching inst, ninst, flag.
// If one is found, returns it. If one is not found, allocates one,
// inserts it in the cache, and returns it.
DFA::State* DFA::CachedState(int* inst, int ninst, uint flag) {
if (DEBUG_MODE)
mutex_.AssertHeld();

// Look in the cache for a pre-existing state.
State state;
state.inst_ = inst;
state.ninst_ = ninst;
state.flag_ = flag;

StateSet::iterator it = state_cache_.find(&state);
if (it != state_cache_.end()) {
if (DebugDFA)
fprintf(stderr, " -cached-> %s\n", DumpState(*it).c_str());
return *it;
}

// Must have enough memory for new state.
// In addition to what we're going to allocate,
// the state cache hash table seems to incur about 32 bytes per
// State*, empirically.
const int kStateCacheOverhead = 32;
int nnext = prog_->bytemap_range() + 1; // + 1 for kByteEndText slot
int mem = sizeof(State) + nnext*sizeof(std::atomic<State*>) +
ninst*sizeof(int);
if (mem_budget_ < mem + kStateCacheOverhead) {
mem_budget_ = -1;
return NULL;
}
mem_budget_ -= mem + kStateCacheOverhead;

// Allocate new state along with room for next_ and inst_.
char* space = new char[mem];
State* s = new (space) State;
(void) new (s->next_) std::atomic<State*>[nnext];
// Work around an unfortunate bug in older versions of libstdc++.
// (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64658)
for (int i = 0; i < nnext; i++)
(void) new (s->next_ + i) std::atomic<State*>(NULL);
s->inst_ = new (s->next_ + nnext) int[ninst];
memmove(s->inst_, inst, ninst*sizeof s->inst_[0]);
s->ninst_ = ninst;
s->flag_ = flag;
if (DebugDFA)
fprintf(stderr, " -> %s\n", DumpState(s).c_str());

// Put state in cache and return it.
state_cache_.insert(s);
return s;
}

// Clear the cache. Must hold cache_mutex_.w or be in destructor.
void DFA::ClearCache() {
// In case state_cache_ doesn't support deleting entries
// during iteration, copy into a vector and then delete.
vector<State*> v;
v.reserve(state_cache_.size());
for (StateSet::iterator it = state_cache_.begin();
it != state_cache_.end(); ++it)
v.push_back(*it);
state_cache_.clear();
for (size_t i = 0; i < v.size(); i++)
delete[] reinterpret_cast<char*>(v[i]);
}

// Copies insts in state s to the work queue q.
void DFA::StateToWorkq(State* s, Workq* q) {
q->clear();
for (int i = 0; i < s->ninst_; i++) {
if (s->inst_[i] == Mark)
q->mark();
else
// Explore from the head of the list.
AddToQueue(q, s->inst_[i], s->flag_ & kFlagEmptyMask);
}
}

// Adds ip to the work queue, following empty arrows according to flag.
void DFA::AddToQueue(Workq* q, int id, uint flag) {

// Use astack_ to hold our stack of instructions yet to process.
// It was preallocated as follows:
// one entry per Capture;
// one entry per EmptyWidth; and
// one entry per Nop.
// This reflects the maximum number of stack pushes that each can
// perform. (Each instruction can be processed at most once.)
// When using marks, we also added nmark == prog_->size().
// (Otherwise, nmark == 0.)
int* stk = astack_;
int nstk = 0;

stk[nstk++] = id;
while (nstk > 0) {
DCHECK_LE(nstk, nastack_);
id = stk[--nstk];

Loop:
if (id == Mark) {
q->mark();
continue;
}

if (id == 0)
  continue;

// If ip is already on the queue, nothing to do.
// Otherwise add it.  We don't actually keep all the
// ones that get added, but adding all of them here
// increases the likelihood of q->contains(id),
// reducing the amount of duplicated work.
if (q->contains(id))
  continue;
q->insert_new(id);

// Process instruction.
Prog::Inst* ip = prog_->inst(id);
switch (ip->opcode()) {
  default:
    LOG(DFATAL) << "unhandled opcode: " << ip->opcode();
    break;

  case kInstByteRange:  // just save these on the queue
  case kInstMatch:
    if (ip->last())
      break;
    id = id+1;
    goto Loop;

  case kInstCapture:    // DFA treats captures as no-ops.
  case kInstNop:
    if (!ip->last())
      stk[nstk++] = id+1;

    // If this instruction is the [00-FF]* loop at the beginning of
    // a leftmost-longest unanchored search, separate with a Mark so
    // that future threads (which will start farther to the right in
    // the input string) are lower priority than current threads.
    if (ip->opcode() == kInstNop && q->maxmark() > 0 &&
        id == prog_->start_unanchored() && id != prog_->start())
      stk[nstk++] = Mark;
    id = ip->out();
    goto Loop;

  case kInstAltMatch:
    DCHECK(!ip->last());
    id = id+1;
    goto Loop;

  case kInstEmptyWidth:
    if (!ip->last())
      stk[nstk++] = id+1;

    // Continue on if we have all the right flag bits.
    if (ip->empty() & ~flag)
      break;
    id = ip->out();
    goto Loop;
}

}
}

// Running of work queues. In the work queue, order matters:
// the queue is sorted in priority order. If instruction i comes before j,
// then the instructions that i produces during the run must come before
// the ones that j produces. In order to keep this invariant, all the
// work queue runners have to take an old queue to process and then
// also a new queue to fill in. It's not acceptable to add to the end of
// an existing queue, because new instructions will not end up in the
// correct position.

// Runs the work queue, processing the empty strings indicated by flag.
// For example, flag == kEmptyBeginLine|kEmptyEndLine means to match
// both ^ and $. It is important that callers pass all flags at once:
// processing both ^ and $ is not the same as first processing only ^
// and then processing only $. Doing the two-step sequence won't match
// ^$^$^$ but processing ^ and $ simultaneously will (and is the behavior
// exhibited by existing implementations).
void DFA::RunWorkqOnEmptyString(Workq* oldq, Workq* newq, uint flag) {
newq->clear();
for (Workq::iterator i = oldq->begin(); i != oldq->end(); ++i) {
if (oldq->is_mark(*i))
AddToQueue(newq, Mark, flag);
else
AddToQueue(newq, *i, flag);
}
}

// Runs the work queue, processing the single byte c followed by any empty
// strings indicated by flag. For example, c == 'a' and flag == kEmptyEndLine,
// means to match c$. Sets the bool ismatch to true if the end of the
// regular expression program has been reached (the regexp has matched).
void DFA::RunWorkqOnByte(Workq* oldq, Workq* newq,
int c, uint flag, bool* ismatch,
Prog::MatchKind kind) {
if (DEBUG_MODE)
mutex_.AssertHeld();

newq->clear();
for (Workq::iterator i = oldq->begin(); i != oldq->end(); ++i) {
if (oldq->is_mark(*i)) {
if (*ismatch)
return;
newq->mark();
continue;
}
int id = *i;
Prog::Inst* ip = prog_->inst(id);
switch (ip->opcode()) {
default:
LOG(DFATAL) << "unhandled opcode: " << ip->opcode();
break;

  case kInstFail:        // never succeeds
  case kInstCapture:     // already followed
  case kInstNop:         // already followed
  case kInstAltMatch:    // already followed
  case kInstEmptyWidth:  // already followed
    break;

  case kInstByteRange:   // can follow if c is in range
    if (ip->Matches(c))
      AddToQueue(newq, ip->out(), flag);
    break;

  case kInstMatch:
    if (prog_->anchor_end() && c != kByteEndText)
      break;
    *ismatch = true;
    if (kind == Prog::kFirstMatch) {
      // Can stop processing work queue since we found a match.
      return;
    }
    break;
}

}

if (DebugDFA)
fprintf(stderr, "%s on %d[%#x] -> %s [%d]\n", DumpWorkq(oldq).c_str(),
c, flag, DumpWorkq(newq).c_str(), *ismatch);
}

// Processes input byte c in state, returning new state.
// Caller does not hold mutex.
DFA::State* DFA::RunStateOnByteUnlocked(State* state, int c) {
// Keep only one RunStateOnByte going
// even if the DFA is being run by multiple threads.
MutexLock l(&mutex_);
return RunStateOnByte(state, c);
}

// Processes input byte c in state, returning new state.
DFA::State* DFA::RunStateOnByte(State* state, int c) {
if (DEBUG_MODE)
mutex_.AssertHeld();
if (state <= SpecialStateMax) {
if (state == FullMatchState) {
// It is convenient for routines like PossibleMatchRange
// if we implement RunStateOnByte for FullMatchState:
// once you get into this state you never get out,
// so it's pretty easy.
return FullMatchState;
}
if (state == DeadState) {
LOG(DFATAL) << "DeadState in RunStateOnByte";
return NULL;
}
if (state == NULL) {
LOG(DFATAL) << "NULL state in RunStateOnByte";
return NULL;
}
LOG(DFATAL) << "Unexpected special state in RunStateOnByte";
return NULL;
}

// If someone else already computed this, return it.
State* ns = state->next_[ByteMap(c)].load(std::memory_order_relaxed);
if (ns != NULL)
return ns;

// Convert state into Workq.
StateToWorkq(state, q0_);

// Flags marking the kinds of empty-width things (^ $ etc)
// around this byte. Before the byte we have the flags recorded
// in the State structure itself. After the byte we have
// nothing yet (but that will change: read on).
uint needflag = state->flag_ >> kFlagNeedShift;
uint beforeflag = state->flag_ & kFlagEmptyMask;
uint oldbeforeflag = beforeflag;
uint afterflag = 0;

if (c == '\n') {
// Insert implicit $ and ^ around \n
beforeflag |= kEmptyEndLine;
afterflag |= kEmptyBeginLine;
}

if (c == kByteEndText) {
// Insert implicit $ and \z before the fake "end text" byte.
beforeflag |= kEmptyEndLine | kEmptyEndText;
}

// The state flag kFlagLastWord says whether the last
// byte processed was a word character. Use that info to
// insert empty-width (non-)word boundaries.
bool islastword = (state->flag_ & kFlagLastWord) != 0;
bool isword = (c != kByteEndText && Prog::IsWordChar(static_cast<uint8>(c)));
if (isword == islastword)
beforeflag |= kEmptyNonWordBoundary;
else
beforeflag |= kEmptyWordBoundary;

// Okay, finally ready to run.
// Only useful to rerun on empty string if there are new, useful flags.
if (beforeflag & ~oldbeforeflag & needflag) {
RunWorkqOnEmptyString(q0_, q1_, beforeflag);
swap(q0_, q1_);
}
bool ismatch = false;
RunWorkqOnByte(q0_, q1_, c, afterflag, &ismatch, kind_);

// Most of the time, we build the state from the output of
// RunWorkqOnByte, so swap q0_ and q1_ here. However, so that
// RE2::Set can tell exactly which match instructions
// contributed to the match, don't swap if c is kByteEndText.
// The resulting state wouldn't be correct for further processing
// of the string, but we're at the end of the text so that's okay.
// Leaving q0_ alone preserves the match instructions that led to
// the current setting of ismatch.
if (c != kByteEndText || kind_ != Prog::kManyMatch)
swap(q0_, q1_);

// Save afterflag along with ismatch and isword in new state.
uint flag = afterflag;
if (ismatch)
flag |= kFlagMatch;
if (isword)
flag |= kFlagLastWord;

ns = WorkqToCachedState(q0_, flag);

// Flush ns before linking to it.
// Write barrier before updating state->next_ so that the
// main search loop can proceed without any locking, for speed.
// (Otherwise it would need one mutex operation per input byte.)
state->next_[ByteMap(c)].store(ns, std::memory_order_release);
return ns;
}

//////////////////////////////////////////////////////////////////////
// DFA cache reset.

// Reader-writer lock helper.
//
// The DFA uses a reader-writer mutex to protect the state graph itself.
// Traversing the state graph requires holding the mutex for reading,
// and discarding the state graph and starting over requires holding the
// lock for writing. If a search needs to expand the graph but is out
// of memory, it will need to drop its read lock and then acquire the
// write lock. Since it cannot then atomically downgrade from write lock
// to read lock, it runs the rest of the search holding the write lock.
// (This probably helps avoid repeated contention, but really the decision
// is forced by the Mutex interface.) It's a bit complicated to keep
// track of whether the lock is held for reading or writing and thread
// that through the search, so instead we encapsulate it in the RWLocker
// and pass that around.

class DFA::RWLocker {
public:
explicit RWLocker(Mutex* mu);
~RWLocker();

// If the lock is only held for reading right now,
// drop the read lock and re-acquire for writing.
// Subsequent calls to LockForWriting are no-ops.
// Notice that the lock is released temporarily.
void LockForWriting();

// Returns whether the lock is already held for writing.
bool IsLockedForWriting() {
return writing_;
}

private:
Mutex* mu_;
bool writing_;

DISALLOW_COPY_AND_ASSIGN(RWLocker);
};

DFA::RWLocker::RWLocker(Mutex* mu)
: mu_(mu), writing_(false) {

mu_->ReaderLock();
}

// This function is marked as NO_THREAD_SAFETY_ANALYSIS because the annotations
// do not support lock upgrade.

@junyer (Contributor) commented Jul 4, 2016

> With this change, and with suppressing the warnings, re2 builds with no
> warnings and no errors. The modified files dfa.cc and onepass.cc are
> attached.

Thank you for investigating!

Just to confirm, the initialisation change was necessary to satisfy MSVC?

@DThiebaud (Author) commented Jul 4, 2016

Yes, the initialization change was necessary.

@junyer (Contributor) commented Jul 4, 2016

Okay, thanks!

@DThiebaud (Author) commented

After the last commit, re2 builds with no errors and no warnings and all tests pass.
