diff --git a/README.md b/README.md index dc8bf14..5178a7e 100644 --- a/README.md +++ b/README.md @@ -81,6 +81,7 @@ - [About the Project](#about-the-project) - [Supported Languages](#supported-languages) + - [Supported File Types](#supported-file-types) - [Project Structure](#project-structure) - [Built With](#built-with) - [Getting Started](#getting-started) @@ -90,6 +91,8 @@ - [Usage](#usage) - [Transcribe From](#transcribe-from) - [Save Transcription](#save-transcription) + - [Autosave](#autosave) + - [Overwrite Existing Files](#overwrite-existing-files) - [Transcribe Using](#transcribe-using) - [WhisperX Options](#whisperx-options) - [Transcription Translation](#transcription-translation) @@ -238,6 +241,49 @@ You can also choose the theme you like best. It can be dark, light, or the one c - Yoruba +### Supported File Types + +
+ Audio file formats + + - `.mp3` + - `.mpeg` + - `.wav` + - `.wma` + - `.aac` + - `.flac` + - `.ogg` + - `.oga` + - `.opus` +
+ +
+ Video file formats + + - `.mp4` + - `.m4a` + - `.m4v` + - `.f4v` + - `.f4a` + - `.m4b` + - `.m4r` + - `.f4b` + - `.mov` + - `.avi` + - `.webm` + - `.flv` + - `.mkv` + - `.3gp` + - `.3gp2` + - `.3g2` + - `.3gpp` + - `.3gpp2` + - `.ogv` + - `.ogx` + - `.wmv` + - `.asf` +
+ ### Project Structure @@ -399,7 +445,6 @@ You can also choose the theme you like best. It can be dark, light, or the one c source venv/Scripts/activate ``` 5. Run `cat requirements.txt | xargs -n 1 pip install` to install the dependencies. - > [!WARNING] >For some reason, `pip install -r requirements.txt` throws the error "Could not find a version that satisfies the requirement [PACKAGE_NAME]==[PACKAGE_VERSION] (from version: none)" 6. Run `python src/app.py` to start the program. @@ -433,7 +478,7 @@ Once you open the **Audiotext** executable file (explained in the [getting start ### Transcribe From -You can transcribe from three audio sources: +You can transcribe from four sources: - **File** (see image above): Click on the file explorer icon to select the file you want to transcribe. You can also manually enter the path to the file into the input field. You can transcribe audio from both audio and video files. Note that the file explorer has the `All supported files` option selected by default. To select only audio files or video files, click the combo box in the lower right corner of the file explorer to change the file type, as marked in red in the following image: @@ -441,46 +486,58 @@ You can transcribe from three audio sources: ![Supported files](docs/supported-files.png) -
- Supported audio file formats - - - `.mp3` - - `.mpeg` - - `.wav` - - `.wma` - - `.aac` - - `.flac` - - `.ogg` - - `.oga` - - `.opus` -
- -
- Supported video file formats - - - `.mp4` - - `.m4a` - - `.m4v` - - `.f4v` - - `.f4a` - - `.m4b` - - `.m4r` - - `.f4b` - - `.mov` - - `.avi` - - `.webm` - - `.flv` - - `.mkv` - - `.3gp` - - `.3gp2` - - `.3g2` - - `.3gpp` - - `.3gpp2` - - `.ogv` - - `.ogx` - - `.wmv` - - `.asf` -
+- **Directory**: Click on the file explorer icon to select the directory with the files you want to transcribe. You can also manually enter the path to the directory into the input field. All supported video and audio files from the root of the directory and its subdirectories will be transcribed. Note that the `Autosave` option is checked and cannot be unchecked because each file's transcription will automatically be saved in the same directory as its source file. + + + + + Main + + +For example, let's use this directory as a reference: + +``` +└───files-to-transcribe + │ paranoid-android.mp3 + │ the-past-recedes.flac + │ + └───movies + seul-contre-tous-1998.mp4 + mulholland-dr-2001.avi +``` + +After transcribing the `files-to-transcribe` directory with the `Generate subtitles` option checked, the folder structure will look like this: + +``` +└───files-to-transcribe + │ paranoid-android.mp3 + │ paranoid-android.srt + │ paranoid-android.txt + │ paranoid-android.vtt + │ the-past-recedes.flac + │ the-past-recedes.srt + │ the-past-recedes.txt + │ the-past-recedes.vtt + │ + └───movies + seul-contre-tous-1998.mp4 + seul-contre-tous-1998.srt + seul-contre-tous-1998.txt + seul-contre-tous-1998.vtt + mulholland-dr-2001.avi + mulholland-dr-2001.srt + mulholland-dr-2001.txt + mulholland-dr-2001.vtt +``` - **Microphone**: To start recording, simply click the `Start recording` button to begin the process. The text of the button will change to `Stop recording` and its color will change to red. Click it to stop recording and generate the transcription. @@ -491,8 +548,6 @@ You can transcribe from three audio sources: https://github.com/HenestrosaDev/audiotext/assets/60482743/bd0323d7-ff54-4363-8b73-a2d56e7f783b - >Video from v2.1.0 - - **YouTube video**: Enter the video URL in the upper input field. When finished, click on the `Generate transcription` button. 
@@ -512,9 +567,32 @@ You can transcribe from three audio sources: ### Save Transcription -Once the program has generated the transcription, you'll see a green `Save transcription` button below the text box. If you click on it, you'll be prompted for a file explorer where you can give the file a name and select the path where you want to save it. The file extension is `.txt` by default, but you can change it to any other text file type. +When you click on the `Save transcription` button, a file explorer will open where you can name the transcription file and select the path where you want to save it. The file extension is `.txt` by default, but you can change it to any other text file type. + +If you use **WhisperX** to generate a transcription and check the `Generate subtitles` option, two files will also be saved along with the text file: a `.vtt` file and a `.srt` file. Both contain the subtitles for the transcribed file, as explained in the [Generate Subtitles](#generate-subtitles) section. + +Please note that any text entered or modified in the textbox **WILL NOT** be included in the saved transcription. + +#### Autosave + +If checked, the transcription will automatically be saved in the same folder as the transcribed file. If you check the `Generate subtitles` option, the subtitle files will also be saved automatically. Existing files with the same name won't be overwritten unless you also check the `Overwrite existing files` option (see below). -If you used **WhisperX** to generate the transcription and checked the `Generate subtitles` option, you'll notice that two files are also saved along with the text file: a `.vtt` file and a `.srt` file. Both contain the subtitles for the transcribed file, as explained in the [Generate Subtitles](#generate-subtitles) section. +#### Overwrite Existing Files + +This option can only be checked if the `Autosave` option is checked. 
If `Overwrite existing files` is checked, existing transcriptions in the same directory as the file being transcribed will be overwritten when saving. + +For example, let's use this directory as a reference: + +``` +└───audios + foo.mp3 + foo.srt + foo.txt +``` + +If we transcribe the audio file `foo.mp3` with the `Generate subtitles`, `Autosave` and `Overwrite existing files` options checked, the files `foo.srt` and `foo.txt` will be overwritten and the file `foo.vtt` will be created. + +On the other hand, if we transcribe the audio file `foo.mp3` with the options `Generate subtitles` and `Autosave` checked and the option `Overwrite existing files` unchecked, the file `foo.vtt` will still be created, but the files `foo.srt` and `foo.txt` will remain unchanged. ### Transcribe Using @@ -548,22 +626,16 @@ The **WhisperX** options appear when the selected transcription method is **Whis To translate the audio into English, simply check the `Translate to English` checkbox before generating the transcription, as shown in the video below. - + https://github.com/HenestrosaDev/audiotext/assets/60482743/0aeeaa17-432f-445c-b29a-d76839be489b -> [!NOTE] ->Video from v2.1.0 - However, there is another unofficial way to translate audio into any supported language by setting the `Audio language` to the target translation language. For example, if the audio is in English and you want to translate it into Spanish, you would set the `Audio language` to "Spanish". Here is a practical example using the microphone: - + https://github.com/HenestrosaDev/audiotext/assets/60482743/b346290f-4654-48c4-bf5a-2dcb75b136e9 -> [!NOTE] ->Video from v2.1.0 - Make sure to double-check the generated translations. #### Generate Subtitles @@ -725,9 +797,9 @@ Remember that **WhisperX** provides fast, unlimited audio transcription that sup - [x] Add subtitle options. - [x] Add advanced options for **WhisperX**. - [x] Add the option to transcribe YouTube videos. 
-- [ ] Add checkbox to automatically save the generated transcription [(#17)](https://github.com/HenestrosaDev/audiotext/issues/17). +- [x] Add checkbox to automatically save the generated transcription [(#17)](https://github.com/HenestrosaDev/audiotext/issues/17). +- [x] Allow transcription of multiple files from a directory. - [ ] Change the "Generate transcription" button to "Cancel transcription" when a transcription is in progress. -- [ ] Allow transcription of multiple files at once [(#17)](https://github.com/HenestrosaDev/audiotext/issues/17). - [ ] Generate executables for macOS and Linux. - [ ] Add pre-commit config for using `Black`, `isort`, and `mypy`. - [ ] Add tests. diff --git a/docs/dark/from-directory.png b/docs/dark/from-directory.png new file mode 100644 index 0000000..1c5781b Binary files /dev/null and b/docs/dark/from-directory.png differ diff --git a/docs/dark/from-file.png b/docs/dark/from-file.png index e8fa27e..2d97e21 100644 Binary files a/docs/dark/from-file.png and b/docs/dark/from-file.png differ diff --git a/docs/dark/from-microphone.png b/docs/dark/from-microphone.png index e4d4fe5..3cbc683 100644 Binary files a/docs/dark/from-microphone.png and b/docs/dark/from-microphone.png differ diff --git a/docs/dark/from-youtube.png b/docs/dark/from-youtube.png index 41169e9..0a5a75e 100644 Binary files a/docs/dark/from-youtube.png and b/docs/dark/from-youtube.png differ diff --git a/docs/dark/main.png b/docs/dark/main.png index e5b6ffd..608e204 100644 Binary files a/docs/dark/main.png and b/docs/dark/main.png differ diff --git a/docs/light/from-directory.png b/docs/light/from-directory.png new file mode 100644 index 0000000..8ee7e42 Binary files /dev/null and b/docs/light/from-directory.png differ diff --git a/docs/light/from-file.png b/docs/light/from-file.png index ea161dd..75a615d 100644 Binary files a/docs/light/from-file.png and b/docs/light/from-file.png differ diff --git a/docs/light/from-microphone.png b/docs/light/from-microphone.png 
index 857795a..44d68be 100644 Binary files a/docs/light/from-microphone.png and b/docs/light/from-microphone.png differ diff --git a/docs/light/from-youtube.png b/docs/light/from-youtube.png index d9a1198..8e45bf9 100644 Binary files a/docs/light/from-youtube.png and b/docs/light/from-youtube.png differ diff --git a/docs/light/main.png b/docs/light/main.png index 586b36b..784066a 100644 Binary files a/docs/light/main.png and b/docs/light/main.png differ diff --git a/docs/main-system.png b/docs/main-system.png index e8fa27e..ff1885d 100644 Binary files a/docs/main-system.png and b/docs/main-system.png differ diff --git a/docs/videos/english-to-spanish.mp4 b/docs/videos/english-to-french.mp4 similarity index 50% rename from docs/videos/english-to-spanish.mp4 rename to docs/videos/english-to-french.mp4 index 3d04a24..cce72dc 100644 Binary files a/docs/videos/english-to-spanish.mp4 and b/docs/videos/english-to-french.mp4 differ diff --git a/docs/videos/english.mp4 b/docs/videos/english.mp4 index f82ff76..c5dea67 100644 Binary files a/docs/videos/english.mp4 and b/docs/videos/english.mp4 differ diff --git a/docs/videos/french-to-english.mp4 b/docs/videos/french-to-english.mp4 deleted file mode 100644 index 2ba79e7..0000000 Binary files a/docs/videos/french-to-english.mp4 and /dev/null differ diff --git a/docs/videos/spanish-to-english.mp4 b/docs/videos/spanish-to-english.mp4 new file mode 100644 index 0000000..51f8905 Binary files /dev/null and b/docs/videos/spanish-to-english.mp4 differ diff --git a/src/views/main_window.py b/src/views/main_window.py index 19204ec..e77a148 100644 --- a/src/views/main_window.py +++ b/src/views/main_window.py @@ -753,6 +753,7 @@ def _on_transcribe_from_change(self, option: str): if self._transcribe_from_source != AudioSource.DIRECTORY: self.chk_autosave.configure(state=ctk.NORMAL) + self.btn_save.configure(state=ctk.NORMAL) if self._transcribe_from_source in [AudioSource.FILE, AudioSource.DIRECTORY]: 
self.btn_main_action.configure(text="Generate transcription") @@ -764,6 +765,7 @@ def _on_transcribe_from_change(self, option: str): self.chk_autosave.select() self.chk_autosave.configure(state=ctk.DISABLED) self.chk_overwrite_files.configure(state=ctk.NORMAL) + self.btn_save.configure(state=ctk.DISABLED) elif self._transcribe_from_source == AudioSource.MIC: self.btn_main_action.configure(text="Start recording")
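
Reviewer note: the directory, `Autosave`, and `Overwrite existing files` behavior that the README changes describe can be illustrated with a small standalone sketch. This is not the project's code. The names `SUPPORTED_EXTENSIONS`, `find_media_files`, and `autosave` are hypothetical, the extension set is only a subset of the documented list, and the sketch assumes each output is saved next to its source file under the same base name.

```python
from pathlib import Path

# Hypothetical subset of the extensions listed under "Supported File Types".
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".mp4", ".avi", ".mkv"}


def find_media_files(directory: Path) -> list[Path]:
    """Return supported files in `directory` and all of its subdirectories."""
    return sorted(
        path
        for path in directory.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def autosave(source: Path, outputs: dict[str, str], overwrite: bool) -> list[Path]:
    """Write each output next to `source` and return the paths actually written.

    `outputs` maps an extension such as ".txt" to the text to save.
    Existing files are skipped unless `overwrite` is True.
    """
    written = []
    for extension, text in outputs.items():
        # foo.mp3 -> foo.txt, foo.srt, foo.vtt in the same directory.
        target = source.with_suffix(extension)
        if target.exists() and not overwrite:
            continue  # Leave the existing file untouched.
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

With `foo.mp3`, `foo.srt`, and `foo.txt` already present, calling `autosave` with `overwrite=False` would write only `foo.vtt`, which matches the second `foo.mp3` example in the README; with `overwrite=True` all three files are written.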