Update docs for version 2.2.2 #23

Merged
merged 4 commits into from
Jun 5, 2024
184 changes: 128 additions & 56 deletions README.md
@@ -81,6 +81,7 @@

- [About the Project](#about-the-project)
- [Supported Languages](#supported-languages)
- [Supported File Types](#supported-file-types)
- [Project Structure](#project-structure)
- [Built With](#built-with)
- [Getting Started](#getting-started)
@@ -90,6 +91,8 @@
- [Usage](#usage)
- [Transcribe From](#transcribe-from)
- [Save Transcription](#save-transcription)
- [Autosave](#autosave)
- [Overwrite Existing Files](#overwrite-existing-files)
- [Transcribe Using](#transcribe-using)
- [WhisperX Options](#whisperx-options)
- [Transcription Translation](#transcription-translation)
@@ -238,6 +241,49 @@ You can also choose the theme you like best. It can be dark, light, or the one c
- Yoruba
</details>

### Supported File Types

<details>
<summary>Audio file formats</summary>

- `.mp3`
- `.mpeg`
- `.wav`
- `.wma`
- `.aac`
- `.flac`
- `.ogg`
- `.oga`
- `.opus`
</details>

<details>
<summary>Video file formats</summary>

- `.mp4`
- `.m4a`
- `.m4v`
- `.f4v`
- `.f4a`
- `.m4b`
- `.m4r`
- `.f4b`
- `.mov`
- `.avi`
- `.webm`
- `.flv`
- `.mkv`
- `.3gp`
- `.3gp2`
- `.3g2`
- `.3gpp`
- `.3gpp2`
- `.ogv`
- `.ogx`
- `.wmv`
- `.asf`
</details>
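
As a rough illustration of how the two lists above could be used, the following sketch classifies a path by its extension. The names `AUDIO_EXTENSIONS`, `VIDEO_EXTENSIONS`, and `classify` are hypothetical and are not taken from the Audiotext source; only the extensions themselves come from the lists.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the lists above; the constant names are assumptions.
AUDIO_EXTENSIONS = {
    ".mp3", ".mpeg", ".wav", ".wma", ".aac", ".flac", ".ogg", ".oga", ".opus",
}
VIDEO_EXTENSIONS = {
    ".mp4", ".m4a", ".m4v", ".f4v", ".f4a", ".m4b", ".m4r", ".f4b", ".mov",
    ".avi", ".webm", ".flv", ".mkv", ".3gp", ".3gp2", ".3g2", ".3gpp",
    ".3gpp2", ".ogv", ".ogx", ".wmv", ".asf",
}


def classify(path: str) -> Optional[str]:
    """Return "audio", "video", or None when the extension is not listed."""
    # Compare in lowercase so that "SONG.MP3" is treated like "song.mp3".
    suffix = Path(path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None


print(classify("paranoid-android.mp3"))  # audio
print(classify("mulholland-dr.avi"))     # video
print(classify("notes.txt"))             # None
```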

<!-- PROJECT STRUCTURE -->

### Project Structure
@@ -399,7 +445,6 @@ You can also choose the theme you like best. It can be dark, light, or the one c
source venv/Scripts/activate
```
5. Run `cat requirements.txt | xargs -n 1 pip install` to install the dependencies.
> [!WARNING]
>For some reason, `pip install -r requirements.txt` throws the error "Could not find a version that satisfies the requirement [PACKAGE_NAME]==[PACKAGE_VERSION] (from version: none)"
6. Run `python src/app.py` to start the program.

@@ -433,54 +478,66 @@ Once you open the **Audiotext** executable file (explained in the [getting start

### Transcribe From

You can transcribe from three audio sources:
You can transcribe from four sources:

- **File** (see image above): Click on the file explorer icon to select the file you want to transcribe. You can also manually enter the path to the file into the input field. You can transcribe audio from both audio and video files. Note that the file explorer has the `All supported files` option selected by default. To select only audio files or video files, click the combo box in the lower right corner of the file explorer to change the file type, as marked in red in the following image:

![File explorer](docs/file-explorer.png)

![Supported files](docs/supported-files.png)

<details>
<summary>Supported audio file formats</summary>

- `.mp3`
- `.mpeg`
- `.wav`
- `.wma`
- `.aac`
- `.flac`
- `.ogg`
- `.oga`
- `.opus`
</details>

<details>
<summary>Supported video file formats</summary>

- `.mp4`
- `.m4a`
- `.m4v`
- `.f4v`
- `.f4a`
- `.m4b`
- `.m4r`
- `.f4b`
- `.mov`
- `.avi`
- `.webm`
- `.flv`
- `.mkv`
- `.3gp`
- `.3gp2`
- `.3g2`
- `.3gpp`
- `.3gpp2`
- `.ogv`
- `.ogx`
- `.wmv`
- `.asf`
</details>
- **Directory**: Click on the file explorer icon to select the directory containing the files you want to transcribe. You can also manually enter the path to the directory into the input field. All supported audio and video files in the directory and its subdirectories will be transcribed. Note that the `Autosave` option is checked and cannot be unchecked, because each transcription is automatically saved in the same directory as its source file.

<picture>
<source
srcset="docs/light/from-directory.png"
media="(prefers-color-scheme: light)"
/>
<source
srcset="docs/dark/from-directory.png"
media="(prefers-color-scheme: dark)"
/>
<img
src="docs/light/from-directory.png"
alt="Main"
>
</picture>

For example, let's use this directory as a reference:

```
└───files-to-transcribe
    │   paranoid-android.mp3
    │   the-past-recedes.flac
    └───movies
            seul-contre-tous-1998.mp4
            mulholland-dr-2001.avi
```

After transcribing the `files-to-transcribe` directory with the `Generate subtitles` option checked, the folder structure will look like this:

```
└───files-to-transcribe
    │   paranoid-android.mp3
    │   paranoid-android.srt
    │   paranoid-android.txt
    │   paranoid-android.vtt
    │   the-past-recedes.flac
    │   the-past-recedes.srt
    │   the-past-recedes.txt
    │   the-past-recedes.vtt
    └───movies
            seul-contre-tous-1998.mp4
            seul-contre-tous-1998.srt
            seul-contre-tous-1998.txt
            seul-contre-tous-1998.vtt
            mulholland-dr-2001.avi
            mulholland-dr-2001.srt
            mulholland-dr-2001.txt
            mulholland-dr-2001.vtt
```

- **Microphone**: Click the `Start recording` button to start recording. The button text will change to `Stop recording` and its color will change to red. Click it again to stop recording and generate the transcription.

@@ -491,8 +548,6 @@ You can transcribe from three audio sources:
<!-- english.mp4 -->
https://github.com/HenestrosaDev/audiotext/assets/60482743/bd0323d7-ff54-4363-8b73-a2d56e7f783b

>Video from v2.1.0

- **YouTube video**: Enter the video URL in the upper input field. When finished, click on the `Generate transcription` button.

<picture>
@@ -512,9 +567,32 @@ You can transcribe from three audio sources:

### Save Transcription

Once the program has generated the transcription, you'll see a green `Save transcription` button below the text box. If you click on it, you'll be prompted for a file explorer where you can give the file a name and select the path where you want to save it. The file extension is `.txt` by default, but you can change it to any other text file type.
When you click on the `Save transcription` button, a file explorer will open where you can name the transcription file and choose where to save it. The file extension is `.txt` by default, but you can change it to any other text file type.

If you use **WhisperX** to generate a transcription and check the `Generate subtitles` option, two files will also be saved along with the text file: a `.vtt` file and a `.srt` file. Both contain the subtitles for the transcribed file, as explained in the [Generate Subtitles](#generate-subtitles) section.

Please note that any text entered or modified in the textbox **WILL NOT** be included in the saved transcription.

#### Autosave

If checked, the transcription will automatically be saved in the same folder as the transcribed file. If you also check the `Generate subtitles` option, the subtitle files will be saved automatically as well. Existing files with the same name won't be overwritten unless you check the `Overwrite existing files` option (see below).
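
To make the autosave naming concrete, here is a minimal sketch of how the output paths could be derived from a source file. The helper `autosave_targets` is hypothetical and only mirrors the behavior described in this section (a `.txt` file next to the source, plus `.srt` and `.vtt` when subtitles are requested); it is not the project's actual implementation.

```python
from pathlib import Path
from typing import List


def autosave_targets(source: Path, generate_subtitles: bool) -> List[Path]:
    """Return the output paths that sit next to `source` (hypothetical helper)."""
    suffixes = [".txt"]
    if generate_subtitles:
        suffixes += [".srt", ".vtt"]
    # with_suffix swaps only the final extension and keeps the parent folder.
    return [source.with_suffix(suffix) for suffix in suffixes]


targets = autosave_targets(Path("audios/foo.mp3"), generate_subtitles=True)
print([t.name for t in targets])  # ['foo.txt', 'foo.srt', 'foo.vtt']
```

Because the parent folder is preserved, files found in subdirectories keep their transcriptions alongside them.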

If you used **WhisperX** to generate the transcription and checked the `Generate subtitles` option, you'll notice that two files are also saved along with the text file: a `.vtt` file and a `.srt` file. Both contain the subtitles for the transcribed file, as explained in the [Generate Subtitles](#generate-subtitles) section.
#### Overwrite Existing Files

This option can only be checked if the `Autosave` option is checked. If `Overwrite existing files` is checked, existing transcription files in the same directory as the file being transcribed will be overwritten when saving.

For example, let's use this directory as a reference:

```
└───audios
        foo.mp3
        foo.srt
        foo.txt
```

If we transcribe the audio file `foo.mp3` with the `Generate subtitles`, `Autosave` and `Overwrite existing files` options checked, the files `foo.srt` and `foo.txt` will be overwritten and the file `foo.vtt` will be created.

On the other hand, if we transcribe the audio file `foo.mp3` with the options `Generate subtitles` and `Autosave` checked and the option `Overwrite existing files` unchecked, the file `foo.vtt` will still be created, but the files `foo.srt` and `foo.txt` will remain unchanged.
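
The two `foo.mp3` scenarios above can be expressed as a small decision function. This is only a sketch under the assumption that the check is done per output file; `files_to_write` and its parameters are hypothetical names, not part of Audiotext.

```python
from typing import Iterable, List, Set


def files_to_write(targets: Iterable[str], existing: Set[str], overwrite: bool) -> List[str]:
    """Return the targets that would be written (hypothetical helper).

    A target is skipped only when it already exists and overwriting is off.
    """
    return [name for name in targets if overwrite or name not in existing]


targets = ["foo.txt", "foo.srt", "foo.vtt"]
existing = {"foo.srt", "foo.txt"}

# `Overwrite existing files` checked: all three files are written.
print(files_to_write(targets, existing, overwrite=True))   # ['foo.txt', 'foo.srt', 'foo.vtt']
# `Overwrite existing files` unchecked: only the missing file is created.
print(files_to_write(targets, existing, overwrite=False))  # ['foo.vtt']
```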

### Transcribe Using

@@ -548,22 +626,16 @@ The **WhisperX** options appear when the selected transcription method is **Whis

To translate the audio into English, simply check the `Translate to English` checkbox before generating the transcription, as shown in the video below.

<!-- french-to-english.mp4 -->
<!-- spanish-to-english.mp4 -->
https://github.com/HenestrosaDev/audiotext/assets/60482743/0aeeaa17-432f-445c-b29a-d76839be489b

> [!NOTE]
>Video from v2.1.0

However, there is another unofficial way to translate audio into any supported language by setting the `Audio language` to the target translation language. For example, if the audio is in English and you want to translate it into Spanish, you would set the `Audio language` to "Spanish".

Here is a practical example using the microphone:

<!-- english-to-spanish.mp4 -->
<!-- english-to-french.mp4 -->
https://github.com/HenestrosaDev/audiotext/assets/60482743/b346290f-4654-48c4-bf5a-2dcb75b136e9

> [!NOTE]
>Video from v2.1.0

Make sure to double-check the generated translations.

#### Generate Subtitles
@@ -725,9 +797,9 @@ Remember that **WhisperX** provides fast, unlimited audio transcription that sup
- [x] Add subtitle options.
- [x] Add advanced options for **WhisperX**.
- [x] Add the option to transcribe YouTube videos.
- [ ] Add checkbox to automatically save the generated transcription [(#17)](https://github.com/HenestrosaDev/audiotext/issues/17).
- [x] Add checkbox to automatically save the generated transcription [(#17)](https://github.com/HenestrosaDev/audiotext/issues/17).
- [x] Allow transcription of multiple files from a directory.
- [ ] Change the "Generate transcription" button to "Cancel transcription" when a transcription is in progress.
- [ ] Allow transcription of multiple files at once [(#17)](https://github.com/HenestrosaDev/audiotext/issues/17).
- [ ] Generate executables for macOS and Linux.
- [ ] Add pre-commit config for using `Black`, `isort`, and `mypy`.
- [ ] Add tests.
Binary file added docs/dark/from-directory.png
Binary file modified docs/dark/from-file.png
Binary file modified docs/dark/from-microphone.png
Binary file modified docs/dark/from-youtube.png
Binary file modified docs/dark/main.png
Binary file added docs/light/from-directory.png
Binary file modified docs/light/from-file.png
Binary file modified docs/light/from-microphone.png
Binary file modified docs/light/from-youtube.png
Binary file modified docs/light/main.png
Binary file modified docs/main-system.png
Binary file not shown.
Binary file modified docs/videos/english.mp4
Binary file not shown.
Binary file removed docs/videos/french-to-english.mp4
Binary file not shown.
Binary file added docs/videos/spanish-to-english.mp4
Binary file not shown.
2 changes: 2 additions & 0 deletions src/views/main_window.py
@@ -753,6 +753,7 @@ def _on_transcribe_from_change(self, option: str):

        if self._transcribe_from_source != AudioSource.DIRECTORY:
            self.chk_autosave.configure(state=ctk.NORMAL)
            self.btn_save.configure(state=ctk.NORMAL)

        if self._transcribe_from_source in [AudioSource.FILE, AudioSource.DIRECTORY]:
            self.btn_main_action.configure(text="Generate transcription")
@@ -764,6 +765,7 @@ def _on_transcribe_from_change(self, option: str):
                self.chk_autosave.select()
                self.chk_autosave.configure(state=ctk.DISABLED)
                self.chk_overwrite_files.configure(state=ctk.NORMAL)
                self.btn_save.configure(state=ctk.DISABLED)

        elif self._transcribe_from_source == AudioSource.MIC:
            self.btn_main_action.configure(text="Start recording")
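
The effect of the two added lines can be summarized without any GUI code. The sketch below is an assumption-laden restatement rather than the project's code: the `AudioSource` enum here is a stand-in (its values and the `YOUTUBE` member are guesses), `widget_states` is a hypothetical helper, and the plain strings `"normal"` and `"disabled"` stand in for `ctk.NORMAL` and `ctk.DISABLED`.

```python
from enum import Enum
from typing import Dict


class AudioSource(Enum):
    # Stand-in enum; member values and YOUTUBE are assumptions.
    FILE = "File"
    DIRECTORY = "Directory"
    MIC = "Microphone"
    YOUTUBE = "YouTube video"


def widget_states(source: AudioSource) -> Dict[str, str]:
    """Map a source to the state of the autosave checkbox and the save button."""
    # In directory mode autosave is forced on, so both the checkbox and the
    # manual save button are locked; every other source leaves them usable.
    state = "disabled" if source is AudioSource.DIRECTORY else "normal"
    return {"chk_autosave": state, "btn_save": state}


print(widget_states(AudioSource.DIRECTORY))  # {'chk_autosave': 'disabled', 'btn_save': 'disabled'}
print(widget_states(AudioSource.FILE))       # {'chk_autosave': 'normal', 'btn_save': 'normal'}
```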