Skip to content
This repository has been archived by the owner on Jul 29, 2023. It is now read-only.

Use threading for PPMd8 decoders #33

Merged
merged 27 commits into from Aug 8, 2021
Merged

Conversation

miurahr
Copy link
Owner

@miurahr miurahr commented Aug 6, 2021

Partial fix for #28

Todo:

  • support threading ppmd8 decode CPython on linux/mac and pthread
  • support threading ppmd8 decode PyPy-CFFI on linux/mac and pthread
  • support threading ppmd8 decode CPython on Windows

Issues:

  • Abort on PyPy when some conditions

Signed-off-by: Hiroshi Miura miurahr@linux.com

Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@miurahr miurahr force-pushed the topic-refactoring-buffer-source branch from ee3f73e to bbe031c Compare August 6, 2021 14:05
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@miurahr miurahr changed the title Move functions to src/ext WIP: Use threading for decoders Aug 7, 2021
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@miurahr miurahr added enhancement New feature or request help wanted Extra attention is needed labels Aug 7, 2021
@miurahr

This comment has been minimized.

@miurahr
Copy link
Owner Author

miurahr commented Aug 7, 2021

@cielavenir could you mind to review changes, test it, and help the progress?

Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@miurahr miurahr force-pushed the topic-refactoring-buffer-source branch from e8a589c to e1ad23e Compare August 7, 2021 05:38
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@miurahr miurahr changed the title WIP: Use threading for decoders Use threading for decoders Aug 7, 2021
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@cielavenir
Copy link
Contributor

@miurahr it is too large to review, but at least tests are working.

Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@miurahr
Copy link
Owner Author

miurahr commented Aug 7, 2021

The core functions to be reviewed are three in lib2/Ppmd8Tdecoder.c :

Byte TReader(const void *p)
is called from PPMd library under lib/ folder. It send signal empty when buffer exhausted, and wait signal notEmpty.

static void * Ppmd8T_decode_run(void *p)

is a function that run in the thread. the core logic comes from src/ext/_ppmdmodule.c L1440-L1470 and protect critical part with mutex.

int Ppmd8T_decode()
is a function that start and handle thread, and return when input buffer is empty.
It also handles signals and detect state of _run function.

Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Copy link
Owner Author

@miurahr miurahr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several space to be improved.

lib2/Ppmd8Tdecoder.c Outdated Show resolved Hide resolved
lib2/Ppmd8Tdecoder.c Outdated Show resolved Hide resolved
lib2/Ppmd8Tdecoder.c Outdated Show resolved Hide resolved
lib2/Ppmd8Tdecoder.c Outdated Show resolved Hide resolved
lib2/Ppmd8Tdecoder.c Outdated Show resolved Hide resolved
lib2/Ppmd8Tdecoder.c Outdated Show resolved Hide resolved
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@miurahr
Copy link
Owner Author

miurahr commented Aug 7, 2021

Windows fatal exception: access violation errors are reproduced on Windows test, there may be a bug but it is too difficult to debug from log.

@cielavenir
Copy link
Contributor

cielavenir commented Aug 7, 2021

Actually I have an issue:

I found that PPMD_pthread_cond_wait1(&inEmpty, &mutex) could fail to receive inEmpty signal. I have not investigated deeply but it seems like that signal was sent during pthread_cond_timedwait's relocking (here relocking would be done because of pthread_cond_wait(&notEmpty, &mutex)).

This happens because pthread_cond_timedwait is not FIFO, so I wrote simple implementation in cielavenir@f224a04 (part of UseEndmarkProperly2).

@cielavenir
Copy link
Contributor

cielavenir commented Aug 7, 2021

Another question: did you test FFI version? The test does not run here.

Inserting cielavenir@43791a5 worked.

Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@miurahr
Copy link
Owner Author

miurahr commented Aug 8, 2021

I found that PPMD_pthread_cond_wait1(&inEmpty, &mutex) could fail to receive inEmpty signal. I have not investigated deeply but it seems like that signal was sent during pthread_cond_timedwait's relocking (here relocking would be done because of pthread_cond_wait(&notEmpty, &mutex)).

This happens because pthread_cond_timedwait is not FIFO, so I wrote simple implementation in cielavenir/pyppmd-py2@f224a04 (part of UseEndmarkProperly2).

Thank you for the suggestion!
I don't prefer global variable other than mutex lock one, so I'd like to rewrite based on your suggestion.

- when too many loop become abort
- check buffer status through shared variables
- introduce nanosleep for windows
- drop usage of pthread_cond_timedwait

Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@cielavenir
Copy link
Contributor

cielavenir commented Aug 8, 2021

@miurahr see 7e707cc and 92d52df

squashing these two commits will become this (collapsed)
diff --git a/lib2/Ppmd8Tdecoder.c b/lib2/Ppmd8Tdecoder.c
index c3c257b..27717f3 100644
--- a/lib2/Ppmd8Tdecoder.c
+++ b/lib2/Ppmd8Tdecoder.c
@@ -5,12 +5,12 @@
 #include "Ppmd8Tdecoder.h"
 
 PPMD_pthread_mutex_t mutex;
-PPMD_pthread_cond_t finished, inEmpty, notEmpty;
+PPMD_pthread_cond_t addInput, notEmpty;
 
 Byte TReader(const void *p) {
     BufferReader *bufferReader = (BufferReader *)p;
     if (bufferReader->inBuffer->pos == bufferReader->inBuffer->size) {
-        PPMD_pthread_cond_signal(&inEmpty);
+        PPMD_pthread_cond_signal(&addInput);
         PPMD_pthread_cond_wait(&notEmpty, &mutex);
     }
     return *((const Byte *)bufferReader->inBuffer->src + bufferReader->inBuffer->pos++);
@@ -18,8 +18,7 @@ Byte TReader(const void *p) {
 
 Bool Ppmd8T_decode_init() {
     PPMD_pthread_mutex_init(&mutex, NULL);
-    PPMD_pthread_cond_init(&finished, NULL);
-    PPMD_pthread_cond_init(&inEmpty, NULL);
+    PPMD_pthread_cond_init(&addInput, NULL);
     PPMD_pthread_cond_init(&notEmpty, NULL);
     return True;
 }
@@ -69,6 +68,7 @@ Ppmd8T_decode_run(void *p) {
     PPMD_pthread_mutex_lock(&mutex);
     args->result = result;
     args->finished = True;
+    PPMD_pthread_cond_signal(&addInput);
     PPMD_pthread_mutex_unlock(&mutex);
     return NULL;
 }
@@ -101,6 +101,7 @@ int Ppmd8T_decode(CPpmd8 *cPpmd8, OutBuffer *out, int max_length, ppmd8_args *ar
             return -2;  // error
         }
     }
+#if 0
     while(True) {
         PPMD_pthread_mutex_lock(&mutex);
         if (PPMD_pthread_cond_wait1(&inEmpty, &mutex) == 0) {
@@ -114,6 +115,13 @@ int Ppmd8T_decode(CPpmd8 *cPpmd8, OutBuffer *out, int max_length, ppmd8_args *ar
         }
         PPMD_pthread_mutex_unlock(&mutex);
     }
-    PPMD_pthread_join(args->handle, NULL);
-    return args->result;
+#endif
+    PPMD_pthread_mutex_lock(&mutex);
+    PPMD_pthread_cond_wait(&addInput, &mutex);
+    PPMD_pthread_mutex_unlock(&mutex);
+    if (args->finished) {
+        PPMD_pthread_join(args->handle, NULL);
+        return args->result;
+    }
+    return 0;
 }

@miurahr miurahr changed the title Use threading for decoders Use threading for PPMd8 decoders Aug 8, 2021
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@miurahr miurahr force-pushed the topic-refactoring-buffer-source branch from 320328b to 9d048ff Compare August 8, 2021 10:06
@miurahr
Copy link
Owner Author

miurahr commented Aug 8, 2021

I'd like to handle pypy3 case, infinite loop to wait thread finish, on another issue tracker.

@cielavenir
Copy link
Contributor

maybe please merge this as it is and I'll revisit my UseEndmarkProperly branch later

@miurahr miurahr merged commit aa6298d into main Aug 8, 2021
miurahr added a commit that referenced this pull request Aug 9, 2021
miurahr added a commit that referenced this pull request Aug 15, 2021
Added
-----
* PPMd8: support endmark option(#39)
* PPMd8: support restore_method option(#24, @cielavenir)
* Add pthread wrapper for macOS and Windows(#33)

Changed
-------
* PPMd8: decompressor use threading(#24,#33)

Fixed
-----
* PPMd8: Decompressor become wrong status when memory_size is smaller than file size(#24,#25,#28,#33,#45,#46)
* PPMd8: Decompressor allocate buffers by PyMem_Malloc() (#42)
* CMake: support CFFI extension generation(#30)
* CMake: support debug flag for extension development(#27)
* CMake: support pytest_runner on windows
* CI: run tox test on pull_request

Signed-off-by: Hiroshi Miura <miurahr@linux.com>
@miurahr miurahr deleted the topic-refactoring-buffer-source branch October 3, 2021 07:59
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants