
Export to Fusion Text+ #14

Closed
Dschogo opened this issue Nov 6, 2022 · 20 comments
Labels
enhancement New feature or request

Comments

@Dschogo

Dschogo commented Nov 6, 2022

Hey,

First of all, awesome work, and huge thanks for making this open source. I have two questions:

  1. Is it possible to export/import the transcript as Text+ objects (created via the API)?
    Why? Text+ has way more features for stylizing text (only for burn-in, of course).
    How? My idea would be to create a new video track (to avoid overwriting existing stuff) and create all the Text+ objects there.

  2. I've skimmed over the code, but couldn't really find where to set/rewrite a bit of code to set the length of each "line"/segment. Will there be an option for this or a similar function in the future?

I'll try to dig a bit deeper and maybe find a solution, but maybe it's already in the works, which would be awesome.

@octimot
Owner

octimot commented Nov 7, 2022

Hey there!

Thanks for the feedback!

1. Text+

Is it possible to export/import the transcript as Text+ objects (created via the API)?
Why? Text+ has way more features for stylizing text (only for burn-in, of course).
How? My idea would be to create a new video track (to avoid overwriting existing stuff) and create all the Text+ objects there.

Adding the transcript segments as Text+ in the timeline might be possible, but only if the Resolve API allows the creation of Fusion clips and then importing them into the timeline.

The way I would approach it would be to

  1. First create the .comp file(s) - that requires understanding how Fusion .comp files work and how you can create a Text node and all the nodes that you need
  2. Then, I would try to import it into the timeline using the Resolve API - maybe AddFusionComp or ImportFusionComp:
TimelineItem

[...]

  AddFusionComp()                                 --> fusionComp         # Adds a new Fusion composition associated with the timeline item.
  ImportFusionComp(path)                          --> fusionComp         # Imports a Fusion composition from given file path by creating and adding a new composition for the item.

As you can probably tell, this involves a lot of work, but feel free to tackle it if you feel brave, and I'll try to help along the way. :-D

2. Subtitles length

I've skimmed over the code, but couldn't really find where to set/rewrite a bit of code to set the length of each "line"/segment. Will there be an option for this or a similar function in the future?

This is done using AI/Whisper because a bit of semantics/context is required so that the subtitles make sense when you create them, otherwise you'd have phrases split in a way that's not readable by the viewer. I think the SRT function is somewhere in the utils.py file.

I've seen some discussions on the topic of subtitle length on their Github page: https://github.com/openai/whisper/discussions?discussions_q=subtitles+length

EDIT: This is partially available in the latest updates - see #42

Cheers!

@Dschogo
Author

Dschogo commented Nov 7, 2022

Thanks for the quick and detailed response.

I've already made some progress with inserting Text+ elements directly into the timeline.

(using the same initialize_resolve as in mots_resolve)

[resolve, project, mediaPool, projectManager, currentBin, currentTimeline] = initialize_resolve()

# insert a Text+ title at the playhead and grab the tools of its Fusion comp
textobj = currentTimeline.InsertFusionTitleIntoTimeline("Text+")
toolList = textobj.GetFusionCompByIndex(1).GetToolList()

# set the text and styling on the Text+ node
toolList[1].StyledText = "something"
toolList[1].Font, toolList[1].Style = "Bell MT", "Bold"

This creates a Text+ object with the default length, the text is set to "something", and, as you can see, other properties are also exposed and configurable.

Since the title is always placed at the current playhead/timecode, I'm searching for a way to move the titles to the correct timecodes, as well as a way to set their duration (to avoid always moving the playhead/timecode around).

@octimot
Owner

octimot commented Nov 7, 2022

This creates a Text+ object with the default length, the text is set to "something", and, as you can see, other properties are also exposed and configurable.

That's cool!

You can move the playhead using

mots_resolve.set_resolve_tc(new_tc)

In principle, something like this could work (writing from the top of my head):

import app

toolkit_ops_obj = ToolkitOps()

transcription_file_path = 'INSERT PATH TO transcription.json FILE'

transcription_data = toolkit_ops_obj.get_transcription_file_data(transcription_file_path=transcription_file_path)

# take each segment
for segment in transcription_data['segments']:

    # read the start and end time in seconds of the segment
    start_time = segment['start']
    end_time = segment['end']

    # convert to resolve timecode
    start_tc = toolkit_ops_obj.calculate_sec_to_resolve_timecode(start_time)

    # .... more stuff, like calculate duration etc.

    # move playhead
    mots_resolve.set_resolve_tc(start_tc)

    # insert text+ using your function
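For reference, the seconds-to-timecode conversion that `calculate_sec_to_resolve_timecode` would have to perform can be sketched standalone. This is a hypothetical helper, assuming a non-drop-frame timeline and Resolve's usual 01:00:00:00 start timecode:

```python
def seconds_to_resolve_timecode(seconds, fps=30, start_tc="01:00:00:00"):
    # convert the timeline start timecode to a total frame count
    h, m, s, f = (int(part) for part in start_tc.split(":"))
    start_frames = ((h * 3600 + m * 60 + s) * fps) + f

    # add the segment offset, rounded to the nearest frame
    total = start_frames + round(seconds * fps)

    # render back as HH:MM:SS:FF (non-drop-frame)
    ff = total % fps
    ss = total // fps
    return f"{ss // 3600:02d}:{ss % 3600 // 60:02d}:{ss % 60:02d}:{ff:02d}"
```

For example, `seconds_to_resolve_timecode(1.5, fps=30)` returns `"01:00:01:15"`. Drop-frame rates (29.97, 59.94) would need extra handling, which this sketch skips.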

@Dschogo
Author

Dschogo commented Nov 8, 2022

Sadly, there seems to be no way of setting a duration right now. The default duration of 5 seconds is sometimes enough (maybe if the line length could be adjusted? The proposed edits in openai-whisper didn't work for me).

One option would be to find the original Text+ .settings file and set the duration there to 60 seconds or similar. That works in theory, because the next insertion would just cut off the end of the previous text. But I didn't manage to find it, and creating a custom text preset also refuses to work, since the API only allows internal presets as far as I could test.

Your proposed code works like a charm (except that mots_resolve.set_resolve_tc wants the timecode as a string).

Apart from editing the source Text+ file, I think it's pretty much impossible with the current API (in terms of ease of use).
It's still a nice addition to get the rough shape into the timeline (gaps and too-long lines, but a great start for manual editing).

@octimot
Owner

octimot commented Nov 9, 2022

Sadly, there seems to be no way of setting a duration right now. The default duration of 5 seconds is sometimes enough (maybe if the line length could be adjusted? The proposed edits in openai-whisper didn't work for me).

These are difficult to manipulate, since the Whisper models use their training data to decide the length of each segment, and you'd probably need to go deep into the neural net to make it happen. I'm pretty sure you could achieve that by fine-tuning the models, but before we get to that functionality in the tool, there are a lot of higher priorities to fix.

One option would be to find the original Text+ .settings file and set the duration there to 60 seconds or similar. That works in theory, because the next insertion would just cut off the end of the previous text. But I didn't manage to find it, and creating a custom text preset also refuses to work, since the API only allows internal presets as far as I could test.

Maybe exporting the Text+ comp and modifying it, or using a custom Fusion comp template and then importing it into the timeline, is a better way. I think the FusionScript API is more flexible than the Resolve API and would allow you to add nodes, configure them, etc.

Your proposed code works like a charm (except that mots_resolve.set_resolve_tc wants the timecode as a string).

Simply convert the timecode to a string when calling the function, i.e. mots_resolve.set_resolve_tc(str(new_timecode)), and it should work.

@octimot octimot added the enhancement New feature or request label Nov 9, 2022
@yokhalel

yokhalel commented Nov 22, 2022

I would love it if you could implement Text+ as an option next to the .srt insert. For me, .srt is just an intermediate step to complete my projects, as I only use the .srt for timecoding Text+ manually, which is a real time hog.

Blackmagic should really overhaul the subtitle stylization: you can't even put an outside stroke on the text; the only stroke option is an inner stroke, which looks ugly af.

@octimot
Owner

octimot commented Nov 22, 2022

@yokhalel I think it would be a cool feature to have, but it does take a lot of work and we need to finish some other stuff before...

Off topic, but maybe useful for your workflow - if you select your entire transcript using CMD/CTRL+A, and then do CMD/CTRL+Shift+C, you will copy all your transcript segments to clipboard together with their timecodes, which you can then paste wherever you need them. You could probably use this to generate the Text+ comp faster and easier.

@yokhalel

@octimot thank you for your response. Can you elaborate a bit more on how to generate these Text+ comps with the copied transcript and timecodes?

@octimot
Owner

octimot commented Dec 1, 2022

@yokhalel Sorry for the late reply. I don't really have a process in mind right now, but...

A Fusion Text+ element looks like this if you copy it from Fusion and paste it into a text editor:

{
	Tools = ordered() {
		Text1 = TextPlus {
			CtrlWZoom = false,
			Inputs = {
				GlobalOut = Input { Value = 119, },
				Width = Input { Value = 1920, },
				Height = Input { Value = 1080, },
				UseFrameFormatSettings = Input { Value = 1, },
				["Gamut.SLogVersion"] = Input { Value = FuID { "SLog2" }, },
				LayoutRotation = Input { Value = 1, },
				TransformRotation = Input { Value = 1, },
				Softness1 = Input { Value = 1, },
				StyledText = Input {
					SourceOp = "Text1StyledText",
					Source = "Value",
				},
				Font = Input { Value = "Open Sans", },
				Style = Input { Value = "Bold", },
				VerticalJustificationNew = Input { Value = 3, },
				HorizontalJustificationNew = Input { Value = 3, },
			},
			ViewInfo = OperatorInfo { Pos = { 252.667, 3.72727 } },
		},
		Text1StyledText = BezierSpline {
			SplineColor = { Red = 237, Green = 142, Blue = 243 },
			CtrlWZoom = false,
			NameSet = true,
			KeyFrames = {
				[0] = { 0, RH = { 3.33333333333333, 0.333333333333333 }, Flags = { Linear = true, LockedY = true }, Value = Text {
						Value = "Text 1"
					} },
				[10] = { 1, LH = { 6.66666666666667, 0.666666666666667 }, Flags = { Linear = true, LockedY = true }, Value = Text {
						Value = "Text 2"
					} }
			}
		}
	},
	ActiveTool = "Text1"
}

As you can see, the KeyFrames attribute under Text1StyledText contains the text on each keyframe:

KeyFrame [0]: the value is "Text 1"
KeyFrame [10]: the value is "Text 2"

With a bit of tinkering, you could use the timecodes to add all the needed keyframes to the Text+ element, but this requires a bit of coding skill (or just manually entering them after you convert the timecodes to keyframes).
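To make the keyframe idea concrete, here is a minimal sketch that renders transcript segments as a Fusion KeyFrames table. The function name and the segment shape (a list of dicts with `start` in seconds and `text`, as Whisper-style transcription data typically provides) are illustrative assumptions:

```python
def segments_to_keyframes(segments, fps=30):
    # turn each segment into a "[frame] = { Value = Text { ... } }" entry
    entries = []
    for seg in segments:
        frame = round(seg["start"] * fps)
        text = seg["text"].replace('"', '\\"')  # escape quotes for the Lua-style string
        entries.append(f'\t[{frame}] = {{ Value = Text {{ Value = "{text}" }} }},')
    return "KeyFrames = {\n" + "\n".join(entries) + "\n}"
```

The resulting string could then be pasted into the Text1StyledText spline of a Text+ comp in a text editor.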

tl;dr

I know this is not a proper solution and I would love to code this one day, but we have a list of stuff we want to get done sooner. Maybe it helps you somehow until then...

Maybe @Dschogo is also interested in working this out?

@octimot octimot changed the title Insert output as Text+ | set max char length of segment Export to Fusion Text+ Apr 28, 2023
@octimot
Owner

octimot commented May 3, 2023

@Dschogo @yokhalel @boodybon

Starting from version 0.18.3, you can export the transcription lines into a Fusion Text node.

  1. Open transcript
  2. File -> Export as Fusion text...
  3. Save .comp file on your drive
  4. Open .comp file with notepad/text edit
  5. Select all and copy all contents to clipboard
  6. Open Resolve and/or Fusion (or Fusion page)
  7. CMD/CTRL+V to paste Fusion nodes into composition
  8. Then connect the new nodes wherever you want in your composition and modify the text styling

You can also import the .comp file directly into Fusion, but it will probably remove all your existing nodes from the comp.

Feel free to re-open this issue in case something doesn't work as expected.

Cheers!

@octimot octimot closed this as completed May 3, 2023
@yokhalel

yokhalel commented May 11, 2023

@octimot thank you for implementing this! Unfortunately it doesn't work for me; I'm getting these error messages and nothing gets exported:

grafik
app.log

I'm on the latest Alpha 0.18.3, btw, and I can't see a Text+ Export button, as stated on Patreon. Is it missing or am I blind?

@octimot
Owner

octimot commented May 11, 2023

Hey!

The Export to Fusion text... button should be in the File menu of the main window.

You need to enter the timecode data (the start timecode, probably 01:00:00:00 if you're using Resolve, and the framerate according to your timeline), otherwise the tool cannot know how to arrange the frames in the Fusion node.

Cheers!

@yokhalel

@octimot where do I put in the timecode and framerate?

@octimot
Owner

octimot commented May 11, 2023

A window should pop up after you press the Export to Fusion text... button.

If you transcribed a Resolve timeline, you shouldn't have to do it, because the transcription.json file should already have it stored.

Did you find the menu button?

@yokhalel

Indeed, I transcribed a Resolve timeline. Yes, I was already using the right button; unfortunately, the only window that pops up is the one to choose the file location. When I press save, nothing happens and the folder stays empty.

@octimot
Owner

octimot commented May 11, 2023

This is weird!

This probably means that the Resolve API didn't send the timecode or framerate in the right format... What are the timeline start timecode and framerate?

Also, if you don't have anything sensitive in the transcription, would you mind attaching the IT01 1683822582.wav.transcription.json and IT01 1683822582.wav.json files - assuming this is the transcription that doesn't want to export - just to see whether Resolve passed the timecode data or not?

@yokhalel

The timeline start timecode is 01:00:00:00 and the framerate is 30 fps.
There you go :)
Leaf KW19 Michelle.zip

@octimot
Owner

octimot commented May 12, 2023

@yokhalel

It's a bug, thanks for reporting!

If the first segment starts at frame 0 the tool doesn't assign a proper timecode to that segment for export. I'll code a fix and push it asap!

To prevent this from happening until I push the fix to the standalone, just make sure the first segment in your transcription doesn't start at frame 0. If it does, select that segment in the tool, then go to Integrations -> Align segment start to playhead (or press the : key), move the playhead in Resolve to frame 1 of the timeline, and press OK in the tool.

Here's the exported comp file:
IT01 1683822582.wav.comp.zip

Just open in notepad/text edit, copy all and paste it into Fusion, then connect the merge node.

Let me know if it works!

@yokhalel

Yes this did the trick, thank you!

Just two more questions:

  1. Is it possible to get two lines per segment?
  2. As the .comp file contains all the lines, how can I correct the timing of the subtitles, besides keyframing them in Fusion? There is no sound on the Fusion page.

@octimot
Owner

octimot commented May 12, 2023

Yes this did the trick, thank you!

Cool! The fix is already up on the private repo, if you want to use the github version! We need a bit of time to wrap it into a new standalone...

  1. Is it possible to get two lines per segment?

Unfortunately, not from the tool without a lot of coding. I think there might be a way via Fusion scripting, but I can't help you much there...

  2. As the .comp file contains all the lines, how can I correct the timing of the subtitles, besides keyframing them in Fusion? There is no sound on the Fusion page.

First, I recommend using an Adjustment Layer instead of a Fusion Composition - just right click on the Adjustment Layer and open that in Fusion. This way, you can hear the sound from the timeline in the Fusion page.
Second, you can move the keyframes around if you open the Keyframes panel, and then go to TranscriptText -> StyledText.

Another option directly from the tool: before export, you can use the Align segment start (or end) to playhead function in the Integrations menu.

You can also edit the .comp file: if you look closely, all the transcript segments are next to their keyframe values:

KeyFrames = {
    [0] = { Value = Text { Value = " Wenn ihr diese" } },
    [27] = { Value = Text { Value = " schöne, reine Haut" } },
   ...

Above, 0 and 27 are the frame numbers in your Fusion comp. You can totally change these before pasting them into Fusion.
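For instance, retiming all the subtitles at once can be sketched with a small script (illustrative only, using just the standard library, and assuming the keyframe entries look exactly like the excerpt above) that shifts every keyframe number by a fixed offset:

```python
import re

def shift_keyframes(comp_text, offset_frames):
    # rewrite every "[N] = { Value = Text" keyframe index by adding the offset
    return re.sub(
        r"\[(\d+)\](\s*=\s*\{\s*Value\s*=\s*Text)",
        lambda m: f"[{int(m.group(1)) + offset_frames}]{m.group(2)}",
        comp_text,
    )
```

Running the exported .comp text through this before pasting into Fusion would move all the text keyframes later (or earlier, with a negative offset) without touching anything else in the comp.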

BTW, I noticed that you have some gaps in your transcription. You should consider using the Prevent Gaps Shorter Than value when you transcribe, and maybe set it to 0.3 or something similar. This way, if you have transcript segments that are very close to each other, the previous one will extend to reach the next one so that you don't have a few frames without text on the final video.
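The gap-prevention behavior described above can be sketched as a simple post-processing pass over the segments (an illustration of the idea, not the tool's actual implementation): when two consecutive segments are separated by less than the threshold, extend the first one to meet the second:

```python
def close_short_gaps(segments, max_gap=0.3):
    # segments: list of {'start': ..., 'end': ...} dicts in seconds, sorted by start
    for prev, nxt in zip(segments, segments[1:]):
        gap = nxt["start"] - prev["end"]
        if 0 < gap < max_gap:
            # extend the previous segment so no short text-less gap remains
            prev["end"] = nxt["start"]
    return segments
```

Gaps at or above the threshold are left alone, since those are presumably intentional pauses.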
