add jit related functions
neurobin committed Jan 23, 2017
1 parent 964f299 commit 5838c18
Showing 6 changed files with 102 additions and 116 deletions.
2 changes: 2 additions & 0 deletions ChangeLog
@@ -10,6 +10,8 @@ Version 10.29.01 - Mon Jan 23 12:23:12 UTC 2017
4. Object instantiation of select class is prohibited.
5. The behavior of the shorthand `match()` and `replace()` functions in the Regex class has changed. When they are called with no argument they will use previously set options, but when they are called with arguments, they will initiate a temporary match/replace object and will not use (or change) any previous options. This temporary object will not affect any class variables (i.e. previously set options) and it won't be available after returning the result.
6. `RegexMatch::match()` and `RegexReplace::replace()` function will no longer take any argument.
7. Add `RegexReplace::setMatchContext()` and `RegexReplace::setMatchData()` function.
8. Add `RegexMatch::setJitStackSize()` and `RegexMatch::freeUnusedJitMemory()` function.
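
The ownership rules behind item 8 (free the old stack before creating a new one, correct a too-small maximum up to the start size, give every copy its own stack, and treat a start size of 0 as "no stack") can be sketched without PCRE2. This is a minimal C++11 illustration, not the library's code: `FakeJitStack`, `fake_stack_create`, `fake_stack_free`, `live_stacks` and `Matcher` are hypothetical stand-ins for the PCRE2 JIT stack, its create/free calls, and `RegexMatch`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a PCRE2 JIT stack; it only records its sizes.
struct FakeJitStack { std::size_t start, max; };

// Counts live stacks so leaks and double frees become visible.
static int live_stacks = 0;

FakeJitStack* fake_stack_create(std::size_t start, std::size_t max) {
    ++live_stacks;
    return new FakeJitStack{start, max};
}

void fake_stack_free(FakeJitStack* s) {
    assert(s != nullptr);
    --live_stacks;
    delete s;
}

// Hypothetical stand-in for RegexMatch, reduced to JIT stack handling.
class Matcher {
    FakeJitStack* jit_stack = nullptr;
    std::size_t startsize = 0;
    std::size_t maxsize = 0;

    void freeJitStack() {
        if (jit_stack) fake_stack_free(jit_stack);
        jit_stack = nullptr;
    }
    // Always release the previous stack first, so repeated calls never leak.
    void createJitStack() {
        freeJitStack();
        if (startsize) jit_stack = fake_stack_create(startsize, maxsize);
    }

public:
    Matcher() = default;
    // A copy gets its own stack of the same size; the pointer is never shared.
    Matcher(const Matcher& other) { setJitStackSize(other.startsize, other.maxsize); }
    Matcher& operator=(const Matcher& other) {
        if (this != &other) setJitStackSize(other.startsize, other.maxsize);
        return *this;
    }
    ~Matcher() { freeJitStack(); }

    Matcher& setJitStackSize(std::size_t start, std::size_t max) {
        startsize = start;
        maxsize = (max < start) ? start : max;  // a too-small max becomes start
        createJitStack();
        return *this;
    }

    bool hasStack() const { return jit_stack != nullptr; }
    std::size_t maxSize() const { return maxsize; }
};
```

Calling `setJitStackSize(32*1024, 0)` twice on the same `Matcher` leaves exactly one live stack, and `setJitStackSize(0, 0)` releases it, which mirrors the call sequence exercised in `src/test.cpp`.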


Version 10.28.12 - Sun Jan 22 10:34:35 UTC 2016
25 changes: 10 additions & 15 deletions doxy/doxydoc.md
@@ -563,11 +563,11 @@ If you do experiment with various erroneous situations, make use of the `resetEr
# Multi threading {#multi-threading}

1. There is no data race between two separate objects (`Regex`, `RegexMatch` and `RegexReplace`) because the classes do not contain any static variables.
2. Temporary class objects will always be thread safe as no jpcre2 class uses any thread unsafe functions.
2. Temporary class objects will always be thread safe as no jpcre2 class uses any thread unsafe functions except the `Regex::compile()` function when doing JIT compilation. If JIT compile is not required, this function is thread safe too.
3. Temporary class object that uses another third party class object reference or pointer is thread safe provided that the third party object is thread safe i.e its thread safety is defined by the thread safety of the third party object reference or pointer.
4. All member functions of all classes are thread safe provided that the object calling them are thread safe except the `Regex::compile()` function when doing JIT compilation. If JIT compile is not required, this function is thread safe too.
5. Simultaneous access of the same object is MT unsafe. You can use lock and mutex to ensure thread safety.
6. Class objects must be local to each thread to ensure thread safety. Thus with `>=C++11`, you can make it thread safe just by declaring the class objects with `thread_local`
5. Simultaneous access of the same object is MT unsafe. You can make them `thread_local` or use mutex lock or other mechanisms to ensure thread safety.
6. Class objects must be local to each thread to ensure thread safety. Thus with `>=C++11`, you can make it thread safe just by declaring the class objects as `thread_local`
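
The mutex option from point 5 can be sketched with the C++11 standard library. This is an illustration under assumptions, not code from jpcre2: `unsafe_jit_setup()` is a hypothetical stand-in for any thread-unsafe call (such as the JIT-related functions), and `jit_mutex`, `worker()` and `run_workers()` are names invented for the example.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a thread-unsafe call, e.g. a JIT set-up step.
// It mutates shared state without any synchronization of its own.
static int shared_setup_count = 0;
void unsafe_jit_setup() { ++shared_setup_count; }

// One mutex guards every call to the unsafe function.
static std::mutex jit_mutex;

void worker() {
    {
        // Hold the lock only around the unsafe call.
        std::lock_guard<std::mutex> guard(jit_mutex);
        unsafe_jit_setup();
    }
    // Work on objects local to this thread would go here, outside the lock.
}

// Runs n workers and returns how many set-up calls were recorded.
int run_workers(int n) {
    assert(n >= 0);
    shared_setup_count = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back(worker);
    for (auto& t : threads) t.join();
    return shared_setup_count;
}
```

Keeping the critical section as small as possible lets the thread-local matching work run in parallel while only the unsafe step is serialized.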

**Examples:**

@@ -614,6 +614,13 @@ void *thread_safe_fun3(void *arg){ //uses thread_local global variable 'rec1', t
}
```
The following is MT unsafe as it performs JIT compilation:
```cpp
thread_local jp::Regex rec2("\\w", "gS");
//S modifier is for JIT compilation.
```


An example multi-threaded program is provided in *src/test_pthread.cpp*. The thread safety of this program is tested with Valgrind (helgrind tool). See <a href="#test-suit">Test suit</a> for more details on the test.

@@ -788,18 +795,6 @@ jp::Regex("^([^\t]+)\t([^\t]+)$")
> For complete changes see the changelog file
The following are added:

1. `jp::RegexMatch::setMatchStartOffsetVector()`: Vector to store the offsets where matches start in the subject string.
2. `jp::RegexMatch::setMatchEndOffsetVector()`: Vector to store the offsets where matches end in the subject string.
3. `jp::Regex::resetCharacterTables()`: Reset the character tables according to current locale.

The following are removed:

1. `getEndOffset()`: In favor of `jp::RegexMatch::setMatchEndOffsetVector()`.
2. Thread unsafe function `setLocale()` and its counterparts `getLocale()` and `getLocaleTypeId()` in favor of `jp::Regex::resetCharacterTables()`


# Test suit {#test-suit}
Some test programs are written to check for major flaws like segfault, memory leak and crucial input/output validation. Before trying to run the tests, make sure you have all 3 PCRE2 libraries installed on your system.
175 changes: 77 additions & 98 deletions src/jpcre2.hpp
@@ -856,8 +856,8 @@ struct select{
typedef typename Pcre2Type<BS>::MatchData MatchData;
typedef typename Pcre2Type<BS>::GeneralContext GeneralContext;
typedef typename Pcre2Type<BS>::MatchContext MatchContext;
//~ typedef typename Pcre2Type<BS>::JitCallback JitCallback;
//~ typedef typename Pcre2Type<BS>::JitStack JitStack;
typedef typename Pcre2Type<BS>::JitCallback JitCallback;
typedef typename Pcre2Type<BS>::JitStack JitStack;

template<typename T>
static String toString(T); //prevent implicit type conversion of T
@@ -959,10 +959,10 @@ struct select{
int error_number;
PCRE2_SIZE error_offset;
MatchContext *mcontext;
//Managing jit stack inside class brings thread unsafety
//~ JitStack *jit_stack;
//~ PCRE2_SIZE jit_stack_startsize;
//~ PCRE2_SIZE jit_stack_maxsize;
//Managing jit stack brings thread unsafety
JitStack *jit_stack;
PCRE2_SIZE jit_stack_startsize;
PCRE2_SIZE jit_stack_maxsize;

PCRE2_SIZE _start_offset; //name collision, use _ at start

@@ -989,29 +989,30 @@ struct select{
if (vec_ntn)
vec_ntn->push_back(*ntn_map);
}
//~ void createJitStack(){
//~ if(jit_stack_startsize)
//~ jit_stack = Pcre2Func<BS>::jit_stack_create(jit_stack_startsize, jit_stack_maxsize, 0);
//~ }
//~ void createMatchContext(){
//~ mcontext = Pcre2Func<BS>::match_context_create(0);
//~ }
//this function does not create jit_stack more than once.
//~ void assignJitStack(){
//~ if(!jit_stack) createJitStack();
//~ if(jit_stack){
//~ if(!mcontext) createMatchContext();
//~ if(mcontext) Pcre2Func<BS>::jit_stack_assign(mcontext, 0, jit_stack);
//~ }
//~ }
//~ void freeJitStack(){
//~ if(jit_stack) Pcre2Func<BS>::jit_stack_free(jit_stack);
//~ jit_stack = 0;
//~ }
//~ void freeMatchContext(){
//~ if(mcontext) Pcre2Func<BS>::match_context_free(mcontext);
//~ mcontext = 0;
//~ }
void createJitStack(){
freeJitStack();
if(jit_stack_startsize) jit_stack = Pcre2Func<BS>::jit_stack_create(jit_stack_startsize, jit_stack_maxsize, 0);
}
void createMatchContext(){
freeMatchContext();
mcontext = Pcre2Func<BS>::match_context_create(0);
}
//this function creates jit_stack if not available.
void assignJitStack(){
if(!jit_stack) createJitStack();
if(jit_stack){//not else
if(!mcontext) createMatchContext();
if(mcontext) Pcre2Func<BS>::jit_stack_assign(mcontext, 0, jit_stack);
}
}
void freeJitStack(){
if(jit_stack) Pcre2Func<BS>::jit_stack_free(jit_stack);
jit_stack = 0;
}
void freeMatchContext(){
if(mcontext) Pcre2Func<BS>::match_context_free(mcontext);
mcontext = 0;
}

void init_vars() {
re = 0;
@@ -1030,9 +1031,9 @@ struct select{
_start_offset = 0;
m_subject_ptr = &m_subject;
mcontext = 0;
//~ jit_stack = 0;
//~ jit_stack_startsize = 0;
//~ jit_stack_maxsize = 0;
jit_stack = 0;
jit_stack_startsize = 0;
jit_stack_maxsize = 0;
}

void resetMaps(){
@@ -1069,7 +1070,10 @@ struct select{
error_offset = rm.error_offset;
_start_offset = rm._start_offset;

mcontext = rm.mcontext;
freeMatchContext();
if(rm.mcontext) mcontext = Pcre2Func<BS>::match_context_copy(rm.mcontext);
//no need to copy jit_stack, it will be created if needed
setJitStackSize(rm.jit_stack_startsize, rm.jit_stack_maxsize);

}

@@ -1120,8 +1124,8 @@ struct select{
delete num_sub;
delete nas_map;
delete ntn_map;
//~ freeMatchContext();
//~ freeJitStack();
freeMatchContext();
freeJitStack();
}

/** Reset all class variables to its default (initial) state.
@@ -1132,8 +1136,8 @@ struct select{
virtual RegexMatch& reset() {
resetMaps();
m_subject.clear(); //not ptr , external string won't be modified.
//~ freeMatchContext();
//~ freeJitStack();
freeMatchContext();
freeJitStack();
init_vars();
return *this;
}
@@ -1398,37 +1402,36 @@ struct select{
return *this;
}

///Set the match context.
///You can create match context using the native PCRE2 API.
///The memory is not handled by RegexMatch object and not freed.
///User will be responsible for freeing the memory of the match context.
///@param match_context Pointer to the match context.
///@return Reference to the calling RegexMatch object
virtual RegexMatch& setMatchContext(MatchContext *match_context){
mcontext = match_context;
return *this;
}

//the following is not thread safe. (pcre2_jit_stack_create function is not thread safe.)
//~ ///Set JIT stack size.
//~ ///Some large or complicated pattern may need more than the default stack size (32K).
//~ ///A call to this function will create a new JIT memory on machine stack exclusively for this match object.
//~ ///Any and all copies from this match object will also hold their respective exclusive JIT stack.
//~ ///If this match object is copied into multiple other match objects, it may have a significant memory cost.
//~ ///You can change/reset it to its default state by setting the startsize (first argument) to 0.
//~ ///A call to `RegexMatch::reset()` will reset it to default along with others.
//~ ///@param startsize Starting JIT stack size (usually 32*1024).
//~ ///@param maxsize Maximum size of JIT stack (512*1024 or 1024*1024 should be more than enough). A wrong value, such as less than the startsize will be corrected to startsize.
//~ ///Set the match context.
//~ ///You can create match context using the native PCRE2 API.
//~ ///The memory is not handled by RegexMatch object and not freed.
//~ ///User will be responsible for freeing the memory of the match context.
//~ ///@param match_context Pointer to the match context.
//~ ///@return Reference to the calling RegexMatch object
//~ virtual RegexMatch& setJitStackSize(PCRE2_SIZE startsize, PCRE2_SIZE maxsize){
//~ jit_stack_startsize = startsize;
//~ jit_stack_maxsize = maxsize;
//~ if(jit_stack_maxsize < jit_stack_startsize) jit_stack_maxsize = jit_stack_startsize;
//~ if(jit_stack) freeJitStack();
//~ createJitStack();
//~ virtual RegexMatch& setMatchContext(MatchContext *match_context){
//~ mcontext = match_context;
//~ return *this;
//~ }

//the following is not thread safe. (pcre2_jit_stack_create function is not thread safe.)
///Set JIT stack size.
///Some large or complicated pattern may need more than the default stack size (32K).
///A call to this function will create a new JIT memory on machine stack exclusively for this match object.
///Any and all copies from this match object will also hold their respective exclusive JIT stack.
///If this match object is copied into multiple other match objects, it may have a significant memory cost.
///You can change/reset it to its default state by setting the startsize (first argument) to 0.
///A call to `RegexMatch::reset()` will reset it to default along with others.
///@param startsize Starting JIT stack size (usually 32*1024).
///@param maxsize Maximum size of JIT stack (512*1024 or 1024*1024 should be more than enough). A wrong value, such as less than the startsize will be corrected to startsize.
///@return Reference to the calling RegexMatch object
virtual RegexMatch& setJitStackSize(PCRE2_SIZE startsize, PCRE2_SIZE maxsize){
jit_stack_startsize = startsize;
jit_stack_maxsize = maxsize;
if(jit_stack_maxsize < jit_stack_startsize) jit_stack_maxsize = jit_stack_startsize;
createJitStack();
return *this;
}

/// After a call to this function PCRE2 and JPCRE2 options will be properly set.
/// This function does not initialize or re-initialize options.
/// If you want to set options from scratch, initialize them to 0 before calling this function.
@@ -1509,11 +1512,11 @@ struct select{
return *this;
}

//~ ///Free unused JIT memory.
//~ virtual RegexMatch& freeUnusedJitMemory(){
//~ Pcre2Func<BS>::jit_free_unused_memory(0);
//~ return *this;
//~ }
///Free unused JIT memory.
virtual RegexMatch& freeUnusedJitMemory(){
Pcre2Func<BS>::jit_free_unused_memory(0);
return *this;
}

/// Perform match operation using info from class variables and return the match count and
/// store the results in specified vectors.
@@ -3300,31 +3303,7 @@ struct select{
* */
String replace(const String* mains, const String* repl) {
return RegexReplace(this).setSubject(mains).setReplaceWith(repl).replace();
}

/** @overload
* .
* This action does not affect any class variables.
* The temporary object that is created is not further usable.
* @param mains Subject string
* @return Resultant string after regex replace
* @see RegexReplace::replace()
* */
String replace(const String& mains) {
return RegexReplace(this).setSubject(mains).replace();
}

/** @overload
* .
* This action does not affect any class variables.
* The temporary object that is created is not further usable.
* @param mains Pointer to subject string
* @return Resultant string after regex replace
* @see RegexReplace::replace()
* */
String replace(const String* mains) {
return RegexReplace(this).setSubject(mains).replace();
}
}

/** Shorthand for getReplaceObject().replace()
* All previously set options will be used. It's just a shorthand
@@ -3832,10 +3811,10 @@ jpcre2::SIZE_T jpcre2::select<Char_T, BS>::RegexMatch::match() {
if(vec_soff) vec_soff->clear();
if(vec_eoff) vec_eoff->clear();

//~ //check if jit code is available and assign jit stack if user wants it.
//~ SIZE_T jit_size = 0;
//~ Pcre2Func<BS>::pattern_info(re->code, PCRE2_INFO_JITSIZE, &jit_size);
//~ if(jit_size && jit_stack_startsize) assignJitStack();
//check if jit code is available and assign jit stack if user wants it.
SIZE_T jit_size = 0;
Pcre2Func<BS>::pattern_info(re->code, PCRE2_INFO_JITSIZE, &jit_size);
if(jit_size && jit_stack_startsize) assignJitStack();

/* Using this function ensures that the block is exactly the right size for
the number of capturing parentheses in the pattern. */
7 changes: 5 additions & 2 deletions src/test.cpp
@@ -152,6 +152,11 @@ int main(){
rm.setMatchEndOffsetVector(&vec_eoff); \
re = jp::Regex(PAT, "in"); \
rm.setRegexObject(&re); \
rm.setJitStackSize(32*1024,0); \
rm.setJitStackSize(0,0); \
rm.setJitStackSize(32*1024,0); \
rm.setJitStackSize(32*1024,0); \
rm.freeUnusedJitMemory(); \
rm.setSubject(&text).setModifier("g").match(); \
jp::Regex re4(PAT, "niJS"); \
rm.setRegexObject(&re4); \
@@ -228,12 +233,10 @@ int main(){
rr3 = jp::RegexReplace(&re2); \
\
rr.replace(); re.replace(); \
re.replace(TEXT); \
re.replace(TEXT, TEXT); \
re.replace(TEXT, &text); \
re.replace(TEXT, TEXT, "g"); \
re.replace(TEXT, &text, "g"); \
re.replace(&text); \
re.replace(&text, TEXT); \
re.replace(&text, TEXT, "g"); \
re.replace(&text, &text); \
Binary file added src/test_pthread
Binary file not shown.
9 changes: 8 additions & 1 deletion src/test_pthread.cpp
@@ -12,6 +12,7 @@ typedef jpcre2::select<char> jp;

pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t mutex2 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t mutex3 = PTHREAD_MUTEX_INITIALIZER;

//this is an example how you can use pre-defined data objects in multithreaded program.
//The logic is to wrap your objects inside another class and initialize them with constructor.
@@ -55,7 +56,13 @@ void *thread_safe_fun1(void *arg){ //uses no global or static variable, thus thr

void* thread_safe_fun2(void* arg){//uses no global or static variable, thus thread safe.
jp::Regex re("\\w", "g");
jp::RegexMatch rm(&re);
jp::RegexMatch rm(&re);

//jit related functions are thread unsafe
pthread_mutex_lock( &mutex3 );
rm.setJitStackSize(32*1024,0);
pthread_mutex_unlock( &mutex3 );

rm.setSubject("fdsf").setModifier("g").match();
return 0;
}