Skip to content

Conversation

@nilleb
Copy link

@nilleb nilleb commented Sep 25, 2025

Description

Once the system transcription was working, I attempted to implement a feature I needed: copying the transcription text.
I have also enabled the transcript timestamps (supported by WhisperKit).

The format of the transcript for the moment looks like

<|startoftranscript|><|en|><|transcribe|><|0.00|> (applause)<|2.00|> <|2.00|> of an unprecedented fix.<|4.00|> <|4.00|> After almost 23 years on the air,<|7.00|> <|7.00|> we're suddenly not being broadcast in 20% of the country,<|10.00|> <|10.00|> which is not a situation we relish.<|12.00|> <|12.00|> So we reached out to the chairman of the FCC, Brendan Karn.<|17.00|> <|17.00|> He has, to his credit, agreed to join us<|19.00|> <|19.00|> from his office in Washington and Washington, D.C.<|23.00|> <|23.00|> to be the president of the FCC.<|25.00|> <|25.00|> And he has, to his credit, agreed to join us<|28.00|> <|startoftranscript|><|en|><|transcribe|><|0.00|> office in Washington and here he is now.<|2.96|> <|2.96|> Thank you Chairman Carr for being with us tonight.<|4.80|> <|startoftranscript|><|en|><|transcribe|><|0.00|> (audience cheering)<|0.76|><|endoftext|>

[User Audio Note: The following was spoken by the user during this recording. Please incorporate this context when creating the meeting summary:]

<|startoftranscript|><|en|><|transcribe|><|0.00|> Let's watch a video together.<|7.00|><|endoftext|> <|startoftranscript|><|en|><|transcribe|><|0.00|> here is now thank you chairman corn for being with us tonight<|5.20|> <|5.20|> so what is the name of the chairman of the fcc for the next time<|11.12|> <|11.12|> please add this as an action item<|15.84|> <|startoftranscript|><|en|><|transcribe|><|0.00|> [ Silence ]<|3.98|><|endoftext|>

[End of User Audio Note. Please align the above user input with the meeting content for a comprehensive summary.]

I plan to change the format of the transcript text soon to match something like

0s: Microphone: let's watch a video together
0s: System Audio: (applause)
[..]

This PR also contains the icon for dark mode.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • Unit tests pass locally
  • New tests have been added for new functionality
  • Existing tests have been updated if needed

Checklist

  • I have performed a self-review of my own code
  • I have commented my code where necessary (following the no-comments rule)
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Screenshots (if applicable)

Additional Notes

You must resolve #12 before merging this pull request. This code is hereby granted only if this software is licensed with an open-source license.

@rawandahmad698
Copy link
Collaborator

Will review

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements two major features: a copy transcript functionality for copying transcription text to clipboard and system-wide audio recording capabilities. It also includes foundational work for timestamped transcription support and various UI improvements.

  • Adds copy transcription button alongside the existing copy summary functionality
  • Implements system-wide audio recording via SystemWideTap for capturing all system audio rather than specific applications
  • Introduces timestamped transcription models and utilities with WhisperKit integration

Reviewed Changes

Copilot reviewed 44 out of 51 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
cli New build script for managing Xcode operations
SummaryView.swift Adds copy transcription button to UI
SummaryViewModel.swift Implements copyTranscription() method
TranscriptionService.swift Enhanced with timestamped transcription support
SystemWideTap.swift New system-wide audio capture implementation
AudioRecordingCoordinator.swift Updated to support both process-specific and system-wide recording
RecordingInfo.swift Added timestampedTranscription field
SelectableApp.swift Added "All Apps" option for system-wide recording
Assets.xcassets Added dark mode icon assets

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

do {
let recording = try self.fetchRecordingEntity(id: id, context: context)

// Encode the timestamped transcription to binary data
Copy link

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] This comment is unnecessary as the code clearly shows JSON encoding. Consider removing this comment as it doesn't add value beyond what the code expresses.

Suggested change
// Encode the timestamped transcription to binary data

Copilot uses AI. Check for mistakes.

private func loadModel(_ modelName: String, isDownloaded: Bool) async throws {
do {
print("Loading WhisperKit model: \(modelName), isDownloaded: \(isDownloaded)")
Copy link

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using print() statements for logging in production code is not recommended. These should use the logger instance that's already available in the class for consistent logging behavior.

Copilot uses AI. Check for mistakes.
}
)

print("WhisperKit model loaded successfully: \(modelName)")
Copy link

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using print() statements for logging in production code is not recommended. These should use the logger instance that's already available in the class for consistent logging behavior.

Copilot uses AI. Check for mistakes.
try await whisperModelRepository.markAsDownloaded(name: modelName, sizeInMB: nil)
let modelInfo = await WhisperKit.getModelSizeInfo(for: modelName)
try await whisperModelRepository.markAsDownloaded(name: modelName, sizeInMB: Int64(modelInfo.totalSizeMB))
print("Model marked as downloaded: \(modelName), size: \(modelInfo.totalSizeMB) MB")
Copy link

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using print() statements for logging in production code is not recommended. These should use the logger instance that's already available in the class for consistent logging behavior.

Copilot uses AI. Check for mistakes.

} catch {
throw TranscriptionError.modelLoadingFailed(error.localizedDescription)
print("Failed to load WhisperKit model \(modelName): \(error)")
Copy link

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using print() statements for logging in production code is not recommended. These should use the logger instance that's already available in the class for consistent logging behavior.

Copilot uses AI. Check for mistakes.
Comment on lines +58 to +60
await MainActor.run {
systemWideTap.activate()
}
Copy link

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The systemWideTap.activate() call is redundant here since it's already called earlier in the start() method. This could cause issues or unnecessary work.

Suggested change
await MainActor.run {
systemWideTap.activate()
}

Copilot uses AI. Check for mistakes.
Comment on lines +5 to 6
<key>com.apple.security.temporary-exception.audio-unit-host</key>
<true/>
Copy link

Copilot AI Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using temporary security exceptions in production should be avoided. This entitlement bypasses sandbox restrictions and may not be acceptable for App Store distribution. Consider implementing proper audio unit hosting within the sandbox.

Suggested change
<key>com.apple.security.temporary-exception.audio-unit-host</key>
<true/>

Copilot uses AI. Check for mistakes.
@nilleb
Copy link
Author

nilleb commented Oct 1, 2025

Closing - Maybe I will re-propose when the project switches to MIT. Sorry for the annoyance.

@nilleb nilleb closed this Oct 1, 2025
@rawandahmad698
Copy link
Collaborator

Closing - Maybe I will re-propose when the project switches to MIT. Sorry for the annoyance.

The project is MIT Bro. Check it out

@nilleb nilleb reopened this Oct 1, 2025
@nilleb
Copy link
Author

nilleb commented Oct 1, 2025

I am deeply grateful, Rawa.

But let me rework this a bit because I don't think it's up to quality standards. ^^ (I have learnt a lot about MacOS/Swift during the last week, beyond what my intelligent unanimated friends did for me)

@nilleb nilleb force-pushed the feature/copy-transcript branch from 7a7cfe0 to 1ed4753 Compare October 3, 2025 05:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

"Retry Summarization" triggers an entirely new transcription prior to an attempted re-summarization

4 participants