This repository has been archived by the owner on Jan 30, 2024. It is now read-only.

Merge pull request #13 from hayabhay/dev
Rewrote app to enable saving, browsing & searching transcriptions.
hayabhay committed Feb 5, 2023
2 parents efc2e50 + cde9a26 commit 03ff4e0
Showing 17 changed files with 902 additions and 325 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,5 +1,6 @@
# Custom ignore
local/
data/

### Python template
# Byte-compiled / optimized / DLL files
14 changes: 7 additions & 7 deletions .pre-commit-config.yaml
@@ -5,7 +5,7 @@ default_stages: [commit]
repos:
# Common pre-commit hooks
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.3.0
rev: v4.4.0
hooks:
- id: check-added-large-files
args: ['--maxkb=20000']
@@ -27,7 +27,7 @@ repos:
- id: trailing-whitespace

- repo: https://github.com/Lucas-C/pre-commit-hooks
rev: v1.3.1
rev: v1.4.2
hooks:
- id: forbid-crlf
name: CRLF end-lines checker
@@ -41,7 +41,7 @@ repos:
language: python

- repo: https://github.com/hadialqattan/pycln
rev: v2.1.1
rev: v2.1.3
hooks:
- id: pycln
name: pycln
@@ -62,7 +62,7 @@ repos:
language: system

- repo: https://github.com/mwouts/jupytext
rev: v1.14.1
rev: v1.14.4
hooks:
- id: jupytext
name: jupytext
@@ -75,7 +75,7 @@ repos:
- black==22.1.0

- repo: https://github.com/psf/black
rev: 22.8.0
rev: 23.1.0
hooks:
- id: black
name: black
@@ -89,7 +89,7 @@ repos:
- "--line-length=120"

- repo: https://github.com/timothycrosley/isort
rev: 5.10.1
rev: 5.12.0
hooks:
- id: isort
name: isort
@@ -109,7 +109,7 @@ repos:
- "--trailing-comma"

- repo: https://github.com/PyCQA/flake8
rev: 5.0.4
rev: 6.0.0
hooks:
- id: flake8
name: flake8
113 changes: 0 additions & 113 deletions 01_Transcribe.py

This file was deleted.

29 changes: 29 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,29 @@

## Changelog
All notable changes to this project will be documented in this file.

### `v1.0.0a` (2023-02-07)
Since there was some appetite for this, I've rewritten the app to make it a tad cleaner, with a few additional features based on issues raised and personal preferences.
1. Ability to download entire YouTube playlists and upload multiple files at once
2. Ability to browse, filter, and search through saved audio files (for now, this is done with a simple SQLite database & SQLAlchemy ORM)
3. Auto-export of transcriptions in multiple formats (was a feature request)
4. Simple substring-based search for transcript segments. This is done with a simple `LIKE` query on the SQLite database.
5. Fully reworked UI with a cleaner layout and more intuitive navigation.
6. Ability to save Whisper configurations and reuse them, so the same parameters don't have to be re-entered every time.
7. Removed the ability to crop audio after download to simplify the codebase. Also, temporarily removed summarization until GPT-3 integration is complete.
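
The substring search in item 4 can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: it uses Python's standard `sqlite3` module instead of the SQLAlchemy ORM the app relies on, and the `segment` table, its `id` and `text` columns, and the `escape_like` and `search_segments` helpers are all hypothetical names chosen for the example.

```python
import sqlite3
from typing import List, Tuple


def escape_like(term: str, escape: str = "\\") -> str:
    """Escape LIKE wildcards (% and _) so the term is matched literally."""
    return (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def search_segments(conn: sqlite3.Connection, term: str) -> List[Tuple[int, str]]:
    """Return (id, text) rows whose text contains `term` as a substring."""
    pattern = f"%{escape_like(term)}%"
    cursor = conn.execute(
        # Bind the pattern as a parameter; never interpolate user input into SQL.
        "SELECT id, text FROM segment WHERE text LIKE ? ESCAPE '\\' ORDER BY id",
        (pattern,),
    )
    return cursor.fetchall()


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE segment (id INTEGER PRIMARY KEY, text TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO segment (text) VALUES (?)",
    [
        ("Welcome to the show",),
        ("Today we talk about Whisper",),
        ("100% accurate? Not quite",),
    ],
)

results = search_segments(conn, "whisper")
print(results)
```

SQLite's `LIKE` is case-insensitive for ASCII characters by default, so a lowercase query still matches capitalized text. Escaping `%` and `_` keeps a search term such as `100%` from being treated as a wildcard pattern.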
### `v0.0.1` (2022-10-17)
Initial release for demand testing ([PR #1](https://github.com/hayabhay/whisper-ui/pull/1)).

Features:
- Ability to process media from YouTube & local files
- Whisper transcription
- Basic Hugging Face integration for summarization


## Roadmap
[Planned]

1. Live transcription with Whisper - Will use the [streamlit-webrtc](https://github.com/whitphx/streamlit-webrtc) library. This enables live transcription of audio from a microphone and can be used to take voice notes.
3. CLIP embeddings of transcribed text segments + Faiss index for semantic search
2. GPT-3 integration - One approach is to simply allow for an instruct prompt to be entered for a transcript and save results. Will await feedback before implementing.
4. ...
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2022 Abhay Kashyap
Copyright (c) 2023 Abhay Kashyap

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
20 changes: 12 additions & 8 deletions README.md
@@ -1,29 +1,33 @@
# Streamlit UI for OpenAI's Whisper transcription & analytics

https://user-images.githubusercontent.com/6735526/196173369-27c5ceec-733a-4928-8acb-17cbc2e77a04.mp4
# Streamlit UI for OpenAI's Whisper

This is a simple [Streamlit UI](https://streamlit.io/) for [OpenAI's Whisper speech-to-text model](https://openai.com/blog/whisper/).
It lets you automatically select media by YouTube URL or select local files & then runs Whisper on them.
Following that, it will display some basic analytics on the transcription.
Feel free to send a PR if you want to add any more analytics or features!
It lets you download and transcribe media from YouTube videos, playlists, or local files.
You can then browse, filter, and search through your saved audio files.
Feel free to raise an issue for bugs or feature requests or send a PR.

https://user-images.githubusercontent.com/6735526/216852681-53b6c3db-3e74-4c86-806f-6f6774a9003a.mp4

## Setup
This was built & tested on Python 3.9 but should also work on Python 3.7+, as with the original [Whisper repo](https://github.com/openai/whisper).
This was built & tested on Python 3.11 but should also work on Python 3.7+, as with the original [Whisper repo](https://github.com/openai/whisper).
You'll need to install `ffmpeg` on your system. Then, install the requirements with `pip`.

```
sudo apt install ffmpeg
pip install -r requirements.txt
```
## Usage

Once you're set up, you can run the app with:

```
streamlit run 01_Transcribe.py
streamlit run app/01_🏠_Home.py
```

This will open a new tab in your browser with the app. You can then select a YouTube URL or local file & click "Run Whisper" to run the model on the selected media.

## Changelog
All notable changes to this project, along with a potential feature roadmap, are documented [in this file](CHANGELOG.md).

## License
Whisper is licensed under [MIT](https://github.com/openai/whisper/blob/main/LICENSE) while Streamlit is licensed under [Apache 2.0](https://github.com/streamlit/streamlit/blob/develop/LICENSE).
Everything else is licensed under [MIT](https://github.com/hayabhay/whisper-ui/blob/main/LICENSE).
