Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash dereferencing freed pointer #951

Closed

Conversation

bmatherly
Copy link
Member

If the producer is deleted by the application before the avformat_producer is freed from the cache, then the avformat_producer destructor might try to access the freed producer.

Hold a reference to the producer until the avformat_producer is also closed.

If the producer is deleted by the application before the avformat_producer
is freed from the cache, then the avformat_producer destructor might
try to access the freed producer.

Hold a reference to the producer until the avformat_producer is also
closed.
@ddennedy
Copy link
Member

It does fix the problem I experienced in the comment on the commit. However, when I run the following unit test

    void GetImageManyChainsDoesNotLeak()
    {
        std::unique_ptr<Mlt::Repository> repository(Mlt::Factory::init());
        Mlt::Profile profile("atsc_1080p_25");
        int ow = 1920;
        int oh = 1080;
        mlt_image_format format = mlt_image_rgba;
        for (int i = 0; i < 1000; i++) {
            Mlt::Chain chain(profile, "avformat", "/home/ddennedy/Videos/chimera-converted.mov");
            Mlt::Frame *frame = chain.get_frame();
            frame->get_image(format, ow, oh);
            delete frame;
        }
    }

On master branch it uses up to 170 MiB of RES (RSS) in htop or top, but on this branch it uses up to 795M. Now, that is probably due to the high frequency of mlt_cache utilization and its eventual cleanup. Now, in a real world test I open a recent Shotcut project with 13 clips, proxy turned on, and play it twice through. This branch reaches up to ~1500M whereas master ~1300. Now, that is without editing, which the process of creates and destroys many producers: open the same project, in a long timeline clip, make 17 splits, play through them, undo all of them, redo all of them, and play through them again. master uses ~1800, and this branch up to ~2230. What do you make of these results?

I do think this is more thread safe since the setting of parent->close = NULL in producer_close() and the testing of it in producer_avformat_close() is not atomic. So, I wondered about the usage of atomic pointers in C (stdatomic.h) and found this interesting thread. Notice, they eventually say even though the C11 standard does support it, not for function pointers, which close is (not to mention in a public struct).

@ddennedy
Copy link
Member

What do you think of this to add thread safety? I do not think the additional reference is really necessary as only the mlt_events_disconnect() needs the mlt_producer, but the producer's events are already disconnected when the producer is destroyed.

--- src/modules/avformat/producer_avformat.c
+++ src/modules/avformat/producer_avformat.c
@@ -124,4 +124,5 @@ struct producer_avformat_s
     pthread_mutex_t packets_mutex;
     pthread_mutex_t open_mutex;
+    pthread_mutex_t close_mutex;
     int is_mutex_init;
     pthread_t packets_thread;
@@ -1034,4 +1035,5 @@ static int producer_open(
         pthread_mutex_init(&self->packets_mutex, &attr);
         pthread_mutex_init(&self->open_mutex, &attr);
+        pthread_mutex_init(&self->close_mutex, &attr);
         self->is_mutex_init = 1;
     }
@@ -3859,6 +3861,8 @@ static void producer_avformat_close(producer_avformat self)
     mlt_log_debug(NULL, "producer_avformat_close\n");
 
+    pthread_mutex_lock(&self->close_mutex);
     if (self->parent && self->parent->close)
         mlt_events_disconnect(MLT_PRODUCER_PROPERTIES(self->parent), self);
+    pthread_mutex_unlock(&self->close_mutex);
 
     // Cleanup av contexts
@@ -3913,4 +3917,5 @@ static void producer_avformat_close(producer_avformat self)
         pthread_mutex_destroy(&self->packets_mutex);
         pthread_mutex_destroy(&self->open_mutex);
+        pthread_mutex_destroy(&self->close_mutex);
     }
 
@@ -3941,5 +3946,12 @@ static void producer_close(mlt_producer parent)
 
     // Close the parent
+    mlt_cache_item cache_item = mlt_service_cache_get(MLT_PRODUCER_SERVICE(parent),
+                                                      "producer_avformat");
+    producer_avformat self = mlt_cache_item_data(cache_item, NULL);
+    if (self)
+        pthread_mutex_lock(&self->close_mutex);
     parent->close = NULL;
+    if (self)
+        pthread_mutex_unlock(&self->close_mutex);
     mlt_producer_close(parent);
 

@bmatherly
Copy link
Member Author

Looks good. I think this strategy to synchronize it is better than reference counting.

Do you need to close the cache item?

For producer_close(), I would suggest:

// Disconnect the producer_avformat instance
mlt_cache_item cache_item = mlt_service_cache_get(MLT_PRODUCER_SERVICE(parent),
                                                "producer_avformat");
producer_avformat self = mlt_cache_item_data(cache_item, NULL);
if (self) {
    pthread_mutex_lock(&self->close_mutex);
    self->parent = NULL;
    pthread_mutex_unlock(&self->close_mutex);
}
mlt_cache_item_close(cache_item);

// Close the parent
parent->close = NULL;
mlt_producer_close(parent);

@ddennedy
Copy link
Member

OK, so you are using ->parent = NULL to signal to producer_avformat_close() to skip mlt_events_disconnect(). So, I modify producer_avformat_close() accordingly:

    pthread_mutex_lock(&self->close_mutex);
    if (self->parent)
        mlt_events_disconnect(MLT_PRODUCER_PROPERTIES(self->parent), self);
    pthread_mutex_unlock(&self->close_mutex);

It no longer fixes the crashing/deadlocking! Only also checking self->parent->close makes it stable, but your version does not protect it. Weird. I will commit something that combines all this.

@ddennedy ddennedy closed this Sep 12, 2023
@bmatherly
Copy link
Member Author

It no longer fixes the crashing/deadlocking! Only also checking self->parent->close makes it stable, but your version does not protect it. Weird. I will commit something that combines all this.

Unexpected! I guess we do not fully understand the interactions leading up to the problem.

I agree - combine the best of all of them an hope for the best.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants