Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Live Publication in Archive on Capture Agent Change #4316

Merged
merged 3 commits into from Nov 9, 2022

Conversation

KatrinIhler
Copy link
Member

@KatrinIhler KatrinIhler commented Oct 13, 2022

Before I begin, I would like to apologize to whoever will review this, because you're now going to have to deal with this extremely complicated and messy code.

The problem this PR is trying to fix is the following: If someone changes the capture agent for a scheduled live event, the URLs of the live tracks (which usually contain the capture agent name) are updated for search, but not in the archive. This means that the External API still serves the old information, and the live stream cannot be played. Additionally, the URLs in the search service are only up-to-date temporarily, but can revert back to their old state on the next snapshot.

The reasons for this are two-fold:

  1. The live schedule service reads the capture agent name from the episode Dublin Core spatial field, called location in the metadata. (This is IMO very questionable, because we might put other stuff in that metadata field aside from the capture agent id, and would not expect this to impact the live publication, but changing this was out of scope for this PR. One thing at a time...)
    When changing the capture agent, the scheduler service actually writes the CA id into that metadata field. This information is then put into the Elasticsearch index and handed over to the live schedule service. It is not, however, actually archived. Which means that on the next snapshot (unless we change metadata in the Admin UI, which will send all fields to the back-end on a change, including the location info from the index), that value is overwritten again.
  2. The second problem is that the live schedule service updates the live tracks for search on a capture agent change, but doesn't actually update the live publication that the asset manager knows about.

So basically what I did here in both cases is to actually tell the asset manager what the hell is going on. Because I had a lot of problems understanding the code and was concerned that I could unintentionally break something, this also contains some refactoring. This is not supposed to change behavior, but simply make things clearer. I'm also fairly certain that there are more bugs in the affected code and that it could be simplified more, but this was out of scope for this work. I hope to be able to revisit this in the future, but for now, this will hopefully make things a little bit better.

Edit: I would also like to add that I still don't completely understand why the code does the things it does, so please don't ask me.

The scheduler service updates this field in the index when the capture agent is
changed, but doesn't actually make a snapshot, so this change is
overwritten when the next snapshot is made.
This is also relevant because this field is used in the live schedule
service to generate the URLs for the live tracks.
...for easier understanding.
Before, any changes to the live tracks were only published to
search, but not archived. This would result in the External API serving
information that's no longer up-to-date.
@KatrinIhler KatrinIhler added bug ELAN Pull requests originating from ELAN e.V. java Pull requests that update Java code labels Oct 13, 2022
@gregorydlogan
Copy link
Member

I'm also fairly certain that there are more bugs in the affected code and that it could be simplified more[...]

Did we get bugs in our code, or code in our bugs?

Copy link
Member

@gregorydlogan gregorydlogan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this looks right to me, but without a way to properly test with a streaming-capable CA it's hard to tell. Is there an easy way to set up a fake streaming CA that you're aware of?

@KatrinIhler
Copy link
Member Author

KatrinIhler commented Oct 27, 2022

So this looks right to me, but without a way to properly test with a streaming-capable CA it's hard to tell. Is there an easy way to set up a fake streaming CA that you're aware of?

Technically, you don't need to have a capture agent at all to test this. It's enough if Opencast thinks there's a capture agent. I tested this by first configuring live streaming (add publishLive parameter to a workflow and configure some stuff in the LiveScheduleService.cfg) and then registering three fake capture agents via the /capture-admin rest endpoint.

The changes in this PR only affect the live publication in Opencast, so you can just check if it contains the right URLs for its tracks before/after changing the capture agent by asking the asset manager and the search service respectively. No streaming necessary. :)

Edit:
But if you want a real live stream, I think I have a pyCA config lying around that does that, just ask. ;)

@karendolan
Copy link
Member

karendolan commented Oct 27, 2022

@KatrinIhler back in previous versions of Opencast, a change to schedule/event metadata would make an archive snapshot prior to publishing the update. Is this not the case anymore with scheduled events? [EDIT] I'm not sure if I'm right, but before just the mediapackage top level and catalog metadata is updated when a live scheduled event updates. And now the updated live publication element in the archived mediapackage is also updated.


// update live publication in archive
MediaPackage updatedArchivedMp = updateLivePublication(snapshot.getMediaPackage(), liveTracks);
snapshotVersionCache.put(mpId, assetManager.takeSnapshot(updatedArchivedMp).getVersion());
return true;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KatrinIhler So, this new line 364 and 365 is the crux of the patch to update the publication element in the archive when the live publication track changes. Previously, just the mediapackage top level metadata and catalog is updated but not the publication element, and this updates the publication element as well. Does that sound right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

@KatrinIhler
Copy link
Member Author

@KatrinIhler back in previous versions of Opencast, a change to schedule/event metadata would make an archive snapshot prior to publishing the update. Is this not the case anymore with scheduled events? [EDIT] I'm not sure if I'm right, but before just the mediapackage top level and catalog metadata is updated when a live scheduled event updates. And now the updated live publication element in the archived mediapackage is also updated.

That snapshot is still happening. But the live schedule service also needs to take a snapshot when changing the live publication, and the scheduler service forgot to include some information when taking its snapshot.
The data in the search service was always updated successfully.

@gregorydlogan
Copy link
Member

I have (finally) gotten a test harness working for this, and am confident that it does what it says on the tin. Happy to merge if @karendolan doesn't have any objections?

@gregorydlogan gregorydlogan self-assigned this Nov 9, 2022
@gregorydlogan gregorydlogan merged commit cc98381 into opencast:r/11.x Nov 9, 2022
@lkiesow
Copy link
Member

lkiesow commented Nov 16, 2022

Important: This is merged into r/11.x but reverted as part of the merge back to r/12.x.
Can you please submit a new patch for 12?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug ELAN Pull requests originating from ELAN e.V. java Pull requests that update Java code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants