Conversation
Merging feat/full stack into main
feat: Add CODEOWNERS file and configure branch protection to restrict access to dev branch
Revert "feat: Add CODEOWNERS file and configure branch protection to restrict access to dev branch"
Revert "Merging feat/full stack into main"
feat: Add CODEOWNERS file and configure branch protection to restrict access to dev branch
status code polling feature implemented, status text now more concise
fixed unbound variable "uuid" instances; now returns an error message when there are no more available UUIDs to assign
updated status codes for applicable functions for easier reference
docs: new sub-issue template
docs: new pull_request_template
style: removed version & formatting
Add frontend linting workflow (ESLint, Prettier, Stylelint)
feat: Configure React frontend for GitHub Pages deployment via gh-pag…
ci: Add auto-fix and push logic to frontend linting GitHub Actions wo…
ci: added workflow_dispatch
ci: Add Stylelint configuration file to support CSS linting in CI
fix: duplicate env declaration
refactor: move README back to root
fix: missing env
docs: improve clarity
Merge feat/full stack into main
…ry status codes: added extra status-code functions for functions beyond the basic ones; removed in-memory status_codes so status codes now persist in the CSV file
changed syntax to be more readable
combined all gitignore files into one in the root directory
Wrapped applicable functions in CSV_LOCK; fixed crashing, illogical code in CSV file reading
Feat/polling codes + condensed gitignore
backend endpoints now work as intended, with the exception of the "cuda" parameter; outputs are now more standardised and predictable, so fewer errors occur
dockerised backend + tested -> Invalid credentials in Authorization header
updated README, docker compose, and env
build: backend testing on venv + docker successful
Summary of Changes
Hello @ThePrunerator, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request focuses on enhancing the project's core audio transcription capabilities by upgrading the backend to support GPU acceleration and implementing a robust job tracking system. Concurrently, the frontend has been simplified and restructured to provide a more focused user experience. Significant improvements to development tooling and documentation have also been made to streamline future contributions.
Highlights
- Backend Job Management: Implemented a robust CSV-based job tracking system for audio transcription, including status updates (processing, completed, error) and standardized UUID handling across API endpoints.
- GPU Acceleration: Configured the backend Docker environment to leverage NVIDIA GPUs for PyTorch-based audio processing, significantly enhancing transcription performance.
- Frontend Simplification & Restructuring: Refactored the frontend by removing `react-router-dom` and consolidating the user interface into a single `TextInterface` component for streamlined audio upload and transcription display.
- Improved Development Workflow & Documentation: Introduced new issue and pull request templates, refined `.gitignore` for better dependency management, added Stylelint for consistent CSS, and provided comprehensive `README.md` instructions for easier project setup and execution.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes

1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
Code Review
This pull request introduces significant infrastructure and code changes, enhancing the project's setup, backend job management, and frontend architecture. Key improvements include updated .gitignore and README files, a GPU-enabled Dockerfile for the backend, and a refactored backend API for more robust job tracking via CSV. The frontend has been simplified to a single-page application with new linting and deployment scripts. While these changes generally improve maintainability and correctness, some areas could benefit from further optimization, such as the scalability of CSV operations and the configurability of environment-specific settings.
| uuid="" | ||
| used = set() | ||
|
|
||
| for i in range(10000): | ||
| if i not in used: | ||
| uuid = f"{i:04d}" | ||
| break | ||
| if not os.path.isfile(CSV_FILE): | ||
| with open(CSV_FILE, "w", newline="") as f: | ||
| writer = csv.DictWriter(f, fieldnames=FIELDNAMES) | ||
| writer.writeheader() | ||
|
|
||
| with open(CSV_FILE, "r") as f: | ||
| reader = csv.reader(f) | ||
| for row in reader: | ||
| try: | ||
| if row: | ||
| used.add(int(row[0])) | ||
| except (ValueError,IndexError): | ||
| continue | ||
| for i in range(10000): | ||
| if i not in used: | ||
| uuid = f"{i:04d}" | ||
| break | ||
| if uuid == "": | ||
| timestamp = get_timestamp() | ||
| file_name = file.filename | ||
| logging.error(f"{timestamp}: Error generating UUID for transcription request for file: {file_name}.wav") | ||
| return {"error": "No available UUIDs.", "file_name": file_name} |
The current UUID generation logic iterates up to a fixed limit of 10,000. If the number of jobs exceeds this, the system will fail to generate a new UUID. To ensure continuous operation and avoid this hard limit, consider generating the next UUID by finding the maximum existing UUID in the CSV and incrementing it. This approach scales better for integer-based IDs.
| uuid="" | |
| used = set() | |
| for i in range(10000): | |
| if i not in used: | |
| uuid = f"{i:04d}" | |
| break | |
| if not os.path.isfile(CSV_FILE): | |
| with open(CSV_FILE, "w", newline="") as f: | |
| writer = csv.DictWriter(f, fieldnames=FIELDNAMES) | |
| writer.writeheader() | |
| with open(CSV_FILE, "r") as f: | |
| reader = csv.reader(f) | |
| for row in reader: | |
| try: | |
| if row: | |
| used.add(int(row[0])) | |
| except (ValueError,IndexError): | |
| continue | |
| for i in range(10000): | |
| if i not in used: | |
| uuid = f"{i:04d}" | |
| break | |
| if uuid == "": | |
| timestamp = get_timestamp() | |
| file_name = file.filename | |
| logging.error(f"{timestamp}: Error generating UUID for transcription request for file: {file_name}.wav") | |
| return {"error": "No available UUIDs.", "file_name": file_name} | |
| uuid="" | |
| max_uuid = -1 | |
| if not os.path.isfile(CSV_FILE): | |
| with open(CSV_FILE, "w", newline="") as f: | |
| writer = csv.DictWriter(f, fieldnames=FIELDNAMES) | |
| writer.writeheader() | |
| with open(CSV_FILE, "r") as f: | |
| reader = csv.reader(f) | |
| for row in reader: | |
| try: | |
| if row and row[0].isdigit(): # Ensure row[0] is a digit string before converting | |
| current_uuid = int(row[0]) | |
| if current_uuid > max_uuid: | |
| max_uuid = current_uuid | |
| except (ValueError,IndexError): | |
| continue | |
| uuid = f"{max_uuid + 1:04d}" |
```python
def add_job(uuid: str, file_name: str, status_code: str) -> None:
    """
    Inserts a new job in the CSV.
    Reads all rows, adds the new one, sorts by numeric uuid,
    then rewrites the entire file.
    """
    with CSV_LOCK:
        rows = []
        if os.path.isfile(CSV_FILE):
            with open(CSV_FILE, "r", newline="") as f:
                reader = csv.DictReader(f)
                for row in reader:
                    rows.append(row)

        rows.append({
            "uuid": uuid,
            "file_name": file_name,
            "status_code": status_code
        })

        rows.sort(key=lambda r: int(r["uuid"]))

        with open(CSV_FILE, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
            writer.writeheader()
            writer.writerows(rows)


def update_status(uuid: str, new_status: str) -> None:
    """
    Read the existing CSV, update the status_code for the matching uuid,
    and write out to a temporary file which then replaces the original.
    """
    with CSV_LOCK:
        dir_name = os.path.dirname(CSV_FILE) or "."
        fd, temp_path = tempfile.mkstemp(dir=dir_name, text=True)
        try:
            with os.fdopen(fd, "w", newline="") as tmpf, open(CSV_FILE, "r", newline="") as csvf:
                reader = csv.DictReader(csvf)
                writer = csv.DictWriter(tmpf, fieldnames=FIELDNAMES)
                writer.writeheader()

                for row in reader:
                    if row["uuid"] == uuid:
                        row["status_code"] = new_status
                    writer.writerow(row)
            os.replace(temp_path, CSV_FILE)
        except Exception:
            os.remove(temp_path)
            raise
```
The add_job and update_status functions rewrite the entire CSV file for each operation. While this ensures data consistency and correctness (especially with the atomic update using temporary files in update_status), it can become inefficient for a large number of job entries. For a small-scale application, this might be acceptable, but for future scalability, consider using a more performant data storage solution (e.g., a simple embedded database like SQLite) if the number of jobs is expected to grow significantly.
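As a minimal sketch of the SQLite alternative mentioned above (the table and column names here simply mirror the CSV fields and are illustrative, not code from this PR):

```python
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema mirroring the CSV columns (uuid, file_name, status_code).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "uuid TEXT PRIMARY KEY, file_name TEXT, status_code TEXT)"
    )
    conn.commit()

def add_job(conn: sqlite3.Connection, uuid: str, file_name: str, status_code: str) -> None:
    # Single-row insert; no full-file rewrite needed.
    conn.execute(
        "INSERT INTO jobs (uuid, file_name, status_code) VALUES (?, ?, ?)",
        (uuid, file_name, status_code),
    )
    conn.commit()

def update_status(conn: sqlite3.Connection, uuid: str, new_status: str) -> None:
    # In-place update; SQLite handles atomicity and file locking internally,
    # so the CSV_LOCK and temp-file dance become unnecessary.
    conn.execute("UPDATE jobs SET status_code = ? WHERE uuid = ?", (new_status, uuid))
    conn.commit()
```

Because `uuid` is the primary key, duplicate IDs are rejected at the database level rather than by scanning the whole file.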
```python
if not os.path.isfile(CSV_FILE):
    with open(CSV_FILE, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
```
The logic to check for and create the CSV file if it doesn't exist is duplicated across multiple functions (get_jobs, transcribe, delete_job, get_file_name, get_job_status). This can lead to inconsistencies and makes maintenance harder. Consider centralizing this initialization logic, perhaps in an application startup hook or a dedicated utility function that ensures the CSV file and its header are present before any operations are attempted.
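One way to centralize that initialization is a small idempotent helper that every endpoint (or a single application startup hook) can call. The function name `ensure_csv_exists` is hypothetical; `CSV_FILE` and `FIELDNAMES` are assumed module-level constants as in the reviewed code:

```python
import csv
import os

# Assumed constants, matching the reviewed snippet.
CSV_FILE = "jobs.csv"
FIELDNAMES = ["uuid", "file_name", "status_code"]

def ensure_csv_exists() -> None:
    """Create the CSV with its header if missing.

    Safe to call repeatedly, so it can replace the duplicated
    check-and-create blocks in each endpoint, or run once at startup.
    """
    if not os.path.isfile(CSV_FILE):
        with open(CSV_FILE, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
            writer.writeheader()
```

In a FastAPI app, for example, calling this from a startup event would guarantee the file exists before any request handler touches it.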
```yaml
- audiofiles:/app/audiofiles
- logs:/app/logs
- transcripts:/app/transcripts
```
The transcripts volume is mounted to the frontend container. Unless the frontend application directly reads or writes files from the /app/transcripts directory, this volume mount is unnecessary. Frontend applications typically interact with backend APIs to retrieve data, rather than directly accessing shared file system volumes. Consider removing this volume mount from the frontend service if it's not explicitly used, to maintain better separation of concerns.
```dockerfile
# Optional: install express and multer for the upload server
RUN npm install express multer
```
The frontend/src/components/Transcribe.js component directly sends audio files to the backend's /jobs endpoint. This implies that the frontend itself does not run an upload server. If express and multer are not used by the frontend application, installing them in the Dockerfile adds unnecessary dependencies and increases the image size. Consider removing this line if these packages are not actively utilized by the frontend.
```javascript
try {
  const formData = new FormData();
  formData.append("file", audioFile);
  const response = await fetch("http://localhost:8000/jobs", {
```
The backend API URL http://localhost:8000/jobs is hardcoded. This makes the application difficult to deploy to different environments (e.g., staging, production) without code changes. Consider making API URLs configurable, for example, by using environment variables that can be set during the build process or at runtime.
```diff
-const response = await fetch("http://localhost:8000/jobs", {
+const response = await fetch(process.env.REACT_APP_BACKEND_URL + "/jobs", {
```